davidkim205 committed on
Commit b7c0c30 · verified · 1 Parent(s): 642507c

Update README.md

Files changed (1)
  1. README.md +5 -2
README.md CHANGED
@@ -1,3 +1,7 @@
+---
+language:
+- ko
+---
 # keval-2-1b
 keval-2-1b is an advanced evaluation model specifically designed to assess Korean language models using a LLM-as-a-judge approach. It is a departure from the traditional method which utilized chatgpt for evaluations. keval leverages the Gemma2-9b architecture, enhanced through SFT (Supervised Fine-Tuning) and DPO (Direct Policy Optimization). This model is trained on the newly developed Ko-bench dataset, inspired by MT-bench, tailored for Korean linguistic nuances.
 
@@ -161,5 +165,4 @@ The `score` column represents the ratio of correctly predicted labels to the tot
 |---:|:-----------|:---------|:--------|---------:|:-----------|:----------|:-----------|----:|:-----------|:----------|:----------|:----------|:----------|:----------|:-----------|
 | 0 | keval-2-9b | 0 (0.0%) | 50.0% | 22 | 1 (50.0%) | 1 (50.0%) | 2 (100.0%) | 0 | 2 (100.0%) | 0 | 0 | 1 (50.0%) | 1 (50.0%) | 1 (50.0%) | 2 (100.0%) |
 | 1 | keval-2-3b | 0 (0.0%) | 45.5% | 22 | 2 (100.0%) | 1 (50.0%) | 0 | 0 | 2 (100.0%) | 1 (50.0%) | 0 | 1 (50.0%) | 1 (50.0%) | 0 | 2 (100.0%) |
-| 2 | keval-2-1b | 0 (0.0%) | 36.4% | 22 | 0 | 1 (50.0%) | 2 (100.0%) | 0 | 1 (50.0%) | 0 | 1 (50.0%) | 0 | 0 | 1 (50.0%) | 2 (100.0%) |
-
+| 2 | keval-2-1b | 0 (0.0%) | 36.4% | 22 | 0 | 1 (50.0%) | 2 (100.0%) | 0 | 1 (50.0%) | 0 | 1 (50.0%) | 0 | 0 | 1 (50.0%) | 2 (100.0%) |