Spaces:

Bazsalanszky
/

HunEval

Sleeping

Bazsalanszky commited on Aug 3

Commit

5c64d66

•

1 Parent(s): d02447f

Change description

Files changed (1) hide show

src/about.py CHANGED Viewed

@@ -30,7 +30,14 @@ TITLE = """<h1 align="center" id="space-title">HunEval leaderboard</h1>"""
 # What does your leaderboard evaluate?
 INTRODUCTION_TEXT = """
-This leaderboard evaluates the performance of models on the HunEval benchmark. The goal of this benchmark is to evaluate the performance of models on tasks that require a good understanding of the Hungarian language. The benchmark has two key parts. The first one aims to capture the language understanding capabilities of the model, while the second one focuses on the knowledge of the model. The benchmark is divided into several tasks, each evaluating a different aspect of the model's performance. While designing the benchmark, we aimed to create tasks that very is if not obvious, for a native Hungarian speaker, or someone that has lived in Hungary for a long time, but might be challenging for a model that has not been trained on Hungarian data. This means if a model was trained on Hungarian data, it should perform well on the benchmark, but if it was not, it should might struggle.
 """
 # Which evaluations are you running? how can people reproduce what you have?

 # What does your leaderboard evaluate?
 INTRODUCTION_TEXT = """
+The HunEval leaderboard assesses the performance of models on a benchmark designed to evaluate their proficiency in understanding the Hungarian language and its nuances. The benchmark comprises two
+primary components: (1) linguistic comprehension tasks, which aim to gauge a model's ability to interpret and process Hungarian text; and (2) knowledge-based tasks that examine a model's familiarity
+with Hungarian cultural and linguistic phenomena. The benchmark is comprised of multiple sub-tasks, each targeting a distinct aspect of the model's performance.
+In designing the benchmark, our objective was to create challenges that would be intuitive for native Hungarian speakers or individuals with extensive exposure to the language, but potentially more
+demanding for models without prior training on Hungarian data. As such, we anticipate that models trained on Hungarian datasets will perform well on the benchmark, whereas those lacking this experience
+may encounter difficulties. Notwithstanding, a model's strong performance on the benchmark does not imply expertise in a specific task; rather, it indicates a proficiency in understanding Hungarian
+language and its structures.
 """
 # Which evaluations are you running? how can people reproduce what you have?