Update README.md

This model represents the SFT phase of post-training, using 1.4M instruction-following samples.
- Top-rated conversations from OASST2 and Avoin Avustaja datasets (5K samples)
- Translation samples from EuroParl (1K samples)

We release the [Poro 2 instruction collection](https://huggingface.co/datasets/LumiOpen/poro2-instruction-collection).

## SFT Hyperparameters

| Hyperparameter | Value |

Poro 2 8B SFT shows substantial improvements in Finnish instruction-following capabilities compared to Llama 3.1 8B Instruct, while maintaining strong English performance. Note that the final Instruct model (with DPO) performs significantly better.

### Finnish Instruction Following

|                      | Poro 2 8B SFT | Llama 3.1 8B Instruct | Poro 2 8B Instruct |
|----------------------|---------------|-----------------------|--------------------|
| IFEval Finnish       | 64.69         | 47.31                 | **66.54**          |
| MTBench Finnish      | 5.92          | 4.10                  | **6.75**           |
| AlpacaEval 2 Finnish | 16.80         | 2.05                  | **28.89**          |

### English Instruction Following

|              | Poro 2 8B SFT | Llama 3.1 8B Instruct | Poro 2 8B Instruct |
|--------------|---------------|-----------------------|--------------------|
| IFEval       | **79.66**     | 79.48                 | 79.29              |
| MTBench      | 7.07          | **7.70**              | 7.33               |
| AlpacaEval 2 | 29.67         | 32.70                 | **35.30**          |

**Overall**: ~16% average improvement in Finnish instruction-following benchmarks compared to Llama 3.1 8B Instruct, with maintained English performance. The additional DPO step in the Instruct model provides further improvements.
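
As a sanity check on the headline number, the sketch below shows one plausible reading of the "~16% average improvement" (this reading is an assumption, not stated in the card): the mean absolute gain over Llama 3.1 8B Instruct across the three Finnish benchmarks, with the 0–10 MTBench score mapped to a 0–100 scale.

```python
# Hypothetical reconstruction of the "~16%" figure from the Finnish table above.
# Assumption: "improvement" means mean absolute gain in points, with MTBench
# (scored 0-10) multiplied by 10 to match the 0-100 scale of the other benchmarks.
sft = {
    "IFEval Finnish": 64.69,
    "MTBench Finnish": 5.92 * 10,       # 0-10 score scaled to 0-100
    "AlpacaEval 2 Finnish": 16.80,
}
llama = {
    "IFEval Finnish": 47.31,
    "MTBench Finnish": 4.10 * 10,       # same scaling for the baseline
    "AlpacaEval 2 Finnish": 2.05,
}

# Per-benchmark gain of Poro 2 8B SFT over the Llama 3.1 8B Instruct baseline.
gains = {name: sft[name] - llama[name] for name in sft}
avg_gain = sum(gains.values()) / len(gains)
print(f"Average gain: {avg_gain:.1f} points")  # → Average gain: 16.8 points
```

Under this scaling assumption the three gains (about 17.4, 18.2, and 14.8 points) average to roughly 16.8, consistent with the "~16%" summary.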