Update README.md
README.md
CHANGED
@@ -25,12 +25,49 @@ base_model:
|
|
25 |
library_name: transformers
|
26 |
metrics:
|
27 |
- f1
|
28 |
---
|
29 |
|
30 |
# INJONGO: A Multicultural Intent Detection and Slot-filling Dataset for 16 African Languages
|
31 |
-
|
32 |
## Evaluation Comparison
|
33 |
-
|
34 |
## Language Codes
|
35 |
|
36 |
- **eng**: English
|
@@ -51,13 +88,6 @@ metrics:
|
|
51 |
- **yor**: Yoruba
|
52 |
- **zul**: Zulu
|
53 |
|
54 |
-
## Notes
|
55 |
-
|
56 |
-
- **Bold** values indicate the best performing scores in each category
|
57 |
-
- The highlighted models (AfroXLMR 76L) show the top overall performance
|
58 |
-
- Multi-lingual training generally outperforms in-language training
|
59 |
-
- Standard deviations are reported alongside average scores
|
60 |
-
- AVG does not include English results.
|
61 |
|
62 |
### Citation
|
63 |
```
|
|
|
25 |
library_name: transformers
|
26 |
metrics:
|
27 |
- f1
|
28 |
+
tags:
|
29 |
+
- llama-factory
|
30 |
+
- full
|
31 |
+
- generated_from_trainer
|
32 |
---
|
33 |
|
34 |
# INJONGO: A Multicultural Intent Detection and Slot-filling Dataset for 16 African Languages
|
35 |
+
|
36 |
## Evaluation Comparison
|
37 |
+
|
38 |
+
Zero-Shot Performance of LLMs on Intent Detection and Slot Filling
|
39 |
+
|
40 |
+
### Intent Detection
|
41 |
+
*Scores are accuracy, averaged over five templates. AVG covers only the African languages and excludes English.*
|
42 |
+
|
43 |
+
| Model | eng | amh | ewe | hau | ibo | kin | lin | lug | orm | sna | sot | swa | twi | wol | xho | yor | zul | *AVG* |
|
44 |
+
|-------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|
|
45 |
+
| Llama 3.1 8B | 27.6 | 1.9 | 2.1 | 4.8 | 5.5 | 3.3 | 5.3 | 2.4 | 1.6 | 2.8 | 2.9 | 14.1 | 2.6 | 4.0 | 3.2 | 3.5 | 2.8 | 3.9±2.4 |
|
46 |
+
| Gemma 2 9B | 77.6 | 49.2 | 6.1 | 40.8 | 31.5 | 23.8 | 22.2 | 23.2 | 7.7 | 29.7 | 19.9 | 70.0 | 21.0 | 13.8 | 40.1 | 32.2 | 36.3 | 29.2±8.7 |
|
47 |
+
| Aya-101 13B | 65.3 | 62.9 | 13.4 | 57.8 | 56.9 | 40.4 | 27.8 | 33.9 | 20.8 | 51.2 | 43.9 | 65.9 | 27.2 | 19.7 | 58.1 | 45.9 | 53.2 | 42.4±9.1 |
|
48 |
+
| Gemma 2 27B | 79.5 | 47.2 | 6.3 | 46.5 | 36.9 | 26.7 | 27.5 | 26.1 | 5.8 | 36.7 | 25.6 | 75.5 | 21.2 | 16.4 | 50.2 | 34.8 | 44.3 | 33.0±9.6 |
|
49 |
+
| Llama 3.3 70B | 81.1 | 56.2 | 9.5 | 52.3 | 52.4 | 35.0 | 37.5 | 37.7 | 12.4 | 32.3 | 30.5 | 80.6 | 29.3 | 20.9 | 43.5 | 41.4 | 43.9 | 38.5±9.5 |
|
50 |
+
| Gemini 1.5 Pro | **81.8** | 77.9 | 24.3 | 74.8 | 65.4 | 61.5 | 54.6 | 59.3 | 39.3 | 68.6 | 51.6 | 83.2 | 47.2 | 25.6 | 76.2 | 66.8 | 68.7 | 59.1±9.6 |
|
51 |
+
| GPT-4o (Aug) | 80.9 | 76.0 | 15.1 | 80.7 | 71.8 | 64.7 | 56.4 | 68.2 | 59.3 | 75.5 | 59.7 | 84.5 | 58.6 | 43.7 | 79.6 | 77.0 | 71.2 | 65.1±9.3 |
|
52 |
+
| [Gemma 2 9B IT (SFT)](https://huggingface.co/McGill-NLP/gemma-2-9b-it-Injongo-intent) | 81.2 | **83.3** | **77.1** | **89.8** | **86.7** | **78.6** | **85.8** | **83.6** | **84.6** | **87.7** | **76.8** | **88.8** | **82.6** | **85.1** | **89.1** | **87.9** | **78.9** | **84.1** |
|
53 |
+
|
54 |
+
### Slot Filling
|
55 |
+
*Scores are F1, averaged over five templates. AVG covers only the African languages and excludes English.*
|
56 |
+
|
57 |
+
| Model | eng | amh | ewe | hau | ibo | kin | lin | lug | orm | sna | sot | swa | twi | wol | xho | yor | zul | *AVG* |
|
58 |
+
|-------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|
|
59 |
+
| Llama 3.1 8B | 25.0 | 3.7 | 5.6 | 11.1 | 12.6 | 8.5 | 9.1 | 10.1 | 2.8 | 9.9 | 11.5 | 17.3 | 11.2 | 9.2 | 2.6 | 11.0 | 9.0 | 9.1±2.2 |
|
60 |
+
| Gemma 2 IT 9B | 34.1 | 4.5 | 0.3 | 7.4 | 10.6 | 5.0 | 6.0 | 5.6 | 0.1 | 7.3 | 10.8 | 21.2 | 2.4 | 2.6 | 2.2 | 5.2 | 8.2 | 6.2±2.9 |
|
61 |
+
| Aya-101 13B | 21.4 | 8.2 | 7.9 | 11.8 | 14.6 | 12.2 | 9.4 | 15.5 | 3.6 | 15.0 | 17.0 | 16.2 | 13.8 | 14.0 | 2.8 | 9.6 | 10.6 | 11.4±2.4 |
|
62 |
+
| Gemma 2 IT 27B | 49.8 | 15.7 | 9.5 | 24.1 | 25.2 | 21.7 | 15.2 | 28.4 | 2.6 | 29.8 | 28.0 | 40.2 | 24.3 | 23.3 | 4.5 | 28.1 | 31.0 | 22.0±5.8 |
|
63 |
+
| Llama 3.3 70B Instruct | 52.6 | 26.3 | 22.0 | 29.5 | 35.0 | 31.4 | 25.0 | 30.4 | 9.3 | 29.5 | 36.4 | 40.7 | 35.6 | 36.4 | 6.9 | 34.2 | 31.9 | 28.8±5.2 |
|
64 |
+
| Gemini 1.5 Pro | 52.8 | 15.2 | 18.7 | 31.9 | 35.8 | 34.4 | 34.9 | 34.4 | 12.2 | 36.8 | 43.0 | 37.5 | 34.5 | 34.2 | 6.9 | 33.2 | 38.6 | 30.1±6.1 |
|
65 |
+
| GPT-4o (Aug) | 55.4 | 22.8 | 19.4 | 37.8 | 38.9 | 36.4 | 33.5 | 35.3 | 13.0 | 40.2 | 40.9 | 46.5 | 40.1 | 37.9 | 10.0 | 42.4 | 37.6 | 33.3±6.0 |
|
66 |
+
| [Gemma 2 9B IT (SFT)](https://huggingface.co/McGill-NLP/gemma-2-9b-it-Injongo-slot) | **80.6** | **80.7** | **82.0** | **92.2** | **81.3** | **75.5** | **88.5** | **85.8** | **81.1** | **82.5** | **77.2** | **87.7** | **86.3** | **82.9** | **89.6** | **88.4** | **68.8** | **83.1** |
|
67 |
+
|
68 |
+
**Bold** values indicate the best performance for each language/metric.
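
As a sanity check on how the AVG column relates to the per-language cells, the sketch below recomputes it for one row of the intent detection table. The helper name `parse_row` is hypothetical and not part of any released evaluation code. It assumes each row lists the model, then `eng`, then the 16 African languages, then the AVG cell. It does not attempt to reproduce the ± term, which cannot be derived from the table alone.

```python
import statistics

# GPT-4o row copied from the intent detection table.
GPT4O_INTENT = (
    "| GPT-4o (Aug) | 80.9 | 76.0 | 15.1 | 80.7 | 71.8 | 64.7 | 56.4 | 68.2 "
    "| 59.3 | 75.5 | 59.7 | 84.5 | 58.6 | 43.7 | 79.6 | 77.0 | 71.2 | 65.1±9.3 |"
)


def parse_row(row: str) -> tuple[list[float], float]:
    """Return (African-language scores, reported AVG) for one table row.

    Assumed column layout: model, eng, 16 African languages, AVG.
    """
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    # Skip the model name (index 0) and English (index 1); drop the AVG cell.
    # strip("*") removes markdown bold markers around a value.
    scores = [float(cell.strip("*")) for cell in cells[2:-1]]
    # The AVG cell may carry a "±" term; keep only the mean part.
    reported = float(cells[-1].strip("*").split("±")[0])
    return scores, reported


scores, reported = parse_row(GPT4O_INTENT)
recomputed = statistics.mean(scores)
print(f"languages: {len(scores)}, recomputed: {recomputed:.3f}, reported: {reported}")
```

Because the published cells are already rounded to one decimal, a recomputed mean can differ from the reported AVG by a small rounding margin.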
|
69 |
+
|
70 |
+
|
71 |
## Language Codes
|
72 |
|
73 |
- **eng**: English
|
|
|
88 |
- **yor**: Yoruba
|
89 |
- **zul**: Zulu
|
90 |
|
91 |
|
92 |
### Citation
|
93 |
```
|