Commit cd62867 by senyu1 (verified) · 1 parent: e18023b

Update README.md

Files changed (1): README.md +227 -1
README.md CHANGED
@@ -1,3 +1,229 @@
1
  ---
2
  license: apache-2.0
3
- ---
1
  ---
2
+ pipeline_tag: translation
3
+ language:
4
+ - multilingual
5
+ - en
6
+ - am
7
+ - ar
8
+ - so
9
+ - sw
10
+ - pt
11
+ - af
12
+ - fr
13
+ - zu
14
+ - mg
15
+ - ha
16
+ - sn
17
+ - arz
18
+ - ny
19
+ - ig
20
+ - xh
21
+ - yo
22
+ - st
23
+ - rw
24
+ - tn
25
+ - ti
26
+ - ts
27
+ - om
28
+ - run
29
+ - nso
30
+ - ee
31
+ - ln
32
+ - tw
33
+ - pcm
34
+ - gaa
35
+ - loz
36
+ - lg
37
+ - guw
38
+ - bem
39
+ - efi
40
+ - lue
41
+ - lua
42
+ - toi
43
+ - ve
44
+ - tum
45
+ - tll
46
+ - iso
47
+ - kqn
48
+ - zne
49
+ - umb
50
+ - mos
51
+ - tiv
52
+ - lu
53
+ - ff
54
+ - kwy
55
+ - bci
56
+ - rnd
57
+ - luo
58
+ - wal
59
+ - ss
60
+ - lun
61
+ - wo
62
+ - nyk
63
+ - kj
64
+ - ki
65
+ - fon
66
+ - bm
67
+ - cjk
68
+ - din
69
+ - dyu
70
+ - kab
71
+ - kam
72
+ - kbp
73
+ - kr
74
+ - kmb
75
+ - kg
76
+ - nus
77
+ - sg
78
+ - taq
79
+ - tzm
80
+ - nqo
81
+
82
  license: apache-2.0
83
+ ---
84
+ SSA-COMET-STL is a robust automatic metric for machine translation evaluation (MTE), built on SSA-MTE. It receives a triplet (source sentence, translation, reference translation) and returns a score that reflects the quality of the translation.
85
+ The model is based on an improved, African-enhanced encoder, [afro-xlmr-large-76L](https://huggingface.co/Davlan/afro-xlmr-large-76L).
86
+
87
+ # Paper
88
+
89
+ Coming soon
90
+
91
+ # License
92
+
93
+ Apache-2.0
94
+
95
+ # Usage (SSA-COMET)
96
+
97
+ Using this model requires the `unbabel-comet` package:
98
+
99
+ ```bash
100
+ pip install --upgrade pip # ensures that pip is current
101
+ pip install unbabel-comet
102
+ ```
103
+
104
+ You can then use it through the `comet-score` CLI:
105
+
106
+ ```bash
107
+ comet-score -s {source-inputs}.txt -t {translation-outputs}.txt -r {references}.txt --model McGill-NLP/ssa-comet-stl
108
+ ```
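The CLI call above reads three plain-text files. A minimal sketch of producing them from Python follows, assuming each file holds one segment per line and that line *i* of each file belongs to the same triplet. The helper `write_parallel_files` and the file names are hypothetical and not part of unbabel-comet.

```python
import tempfile
from pathlib import Path


def write_parallel_files(triplets, out_dir):
    """Write the src/mt/ref fields to three line-aligned UTF-8 text files.

    Hypothetical helper for illustration; file names are arbitrary.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    names = {"src": "source.txt", "mt": "translation.txt", "ref": "reference.txt"}
    paths = {}
    for key, name in names.items():
        lines = [t[key] for t in triplets]
        # A newline inside a segment would break the one-segment-per-line alignment.
        if any("\n" in line for line in lines):
            raise ValueError(f"field {key!r} contains a segment with a line break")
        path = out_dir / name
        path.write_text("\n".join(lines) + "\n", encoding="utf-8")
        paths[key] = path
    return paths


triplets = [
    {
        "src": "Nadal sàkọọ́lẹ̀ ìforígbárí o ní àmì méje sóódo pẹ̀lú ilẹ̀ Canada.",
        "mt": "Nadal's head to head record against the Canadian is 7–2.",
        "ref": "Nadal scored seven unanswered points against Canada.",
    },
    {
        "src": "Laipe yi o padanu si Raoniki ni ere Sisi Brisbeni.",
        "mt": "He recently lost against Raonic in the Brisbane Open.",
        "ref": "He recently lost to Raoniki in the game Sisi Brisbeni.",
    },
]

paths = write_parallel_files(triplets, tempfile.mkdtemp())
# Print the command line that would score these files.
print(
    f"comet-score -s {paths['src']} -t {paths['mt']} -r {paths['ref']} "
    "--model McGill-NLP/ssa-comet-stl"
)
```

Writing all three files from the same list of triplets keeps them aligned by construction.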
109
+
110
+ Or using Python:
111
+
112
+ ```python
113
+ from comet import download_model, load_from_checkpoint
114
+ model_path = download_model("McGill-NLP/ssa-comet-stl")
115
+ model = load_from_checkpoint(model_path)
116
+ data = [
117
+     {
118
+         "src": "Nadal sàkọọ́lẹ̀ ìforígbárí o ní àmì méje sóódo pẹ̀lú ilẹ̀ Canada.",
119
+         "mt": "Nadal's head to head record against the Canadian is 7–2.",
120
+         "ref": "Nadal scored seven unanswered points against Canada."
121
+     },
122
+     {
123
+         "src": "Laipe yi o padanu si Raoniki ni ere Sisi Brisbeni.",
124
+         "mt": "He recently lost against Raonic in the Brisbane Open.",
125
+         "ref": "He recently lost to Raoniki in the game Sisi Brisbeni."
126
+     }
127
+ ]
128
+ model_output = model.predict(data, batch_size=8, gpus=1)  # set gpus=0 to run on CPU
129
+ print(model_output)
130
+ ```
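In practice the sources, translations, and references often arrive as three separate lists rather than as ready-made dictionaries. The sketch below shows one way to assemble the `data` structure used above. `build_triplets` is a hypothetical helper, not part of unbabel-comet, and it assumes the three lists are already aligned by index.

```python
def build_triplets(sources, translations, references):
    """Zip three aligned lists into the list of src/mt/ref dicts shown above.

    Hypothetical helper for illustration only.
    """
    if not (len(sources) == len(translations) == len(references)):
        raise ValueError(
            "sources, translations and references must have the same length, got "
            f"{len(sources)}, {len(translations)}, {len(references)}"
        )
    return [
        {"src": s.strip(), "mt": m.strip(), "ref": r.strip()}
        for s, m, r in zip(sources, translations, references)
    ]


sources = ["Laipe yi o padanu si Raoniki ni ere Sisi Brisbeni."]
translations = ["He recently lost against Raonic in the Brisbane Open."]
references = ["He recently lost to Raoniki in the game Sisi Brisbeni."]

data = build_triplets(sources, translations, references)
print(data)
```

The length check fails early on misaligned inputs, which would otherwise pair each translation with the wrong source or reference.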
131
+
132
+ # Intended uses
133
+
134
+ Our model is intended to be used for **MT evaluation**.
135
+
136
+ Given a triplet (source sentence, translation, reference translation), it outputs a single score between 0 and 1, where 1 represents a perfect translation.
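Because each triplet receives its own score, a corpus-level figure has to be aggregated from the per-segment values. The sketch below averages them and counts any values outside the stated 0 to 1 range as a sanity check. `summarize_scores` is a hypothetical helper, and the input values are placeholders, not outputs of this model.

```python
def summarize_scores(scores):
    """Aggregate per-segment scores into simple corpus-level statistics.

    Hypothetical helper; assumes scores are floats on the 0-1 scale described above.
    """
    if not scores:
        raise ValueError("no scores to summarize")
    return {
        "n": len(scores),
        "mean": sum(scores) / len(scores),
        "min": min(scores),
        "max": max(scores),
        # Values outside the documented range are counted, not silently clipped.
        "out_of_range": sum(1 for s in scores if not 0.0 <= s <= 1.0),
    }


# Placeholder values for illustration only.
placeholder_scores = [0.5, 0.75, 1.0]
summary = summarize_scores(placeholder_scores)
print(summary)
```

In recent unbabel-comet releases the object returned by `model.predict` is expected to expose the per-segment values as `scores` and their average as `system_score`; check this against the installed version before relying on it.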
137
+
138
+ # Languages Covered:
139
+
140
+ There are 76 languages available:
141
+ - English (eng)
142
+ - Amharic (amh)
143
+ - Arabic (ara)
144
+ - Somali (som)
145
+ - Kiswahili (swa)
146
+ - Portuguese (por)
147
+ - Afrikaans (afr)
148
+ - French (fra)
149
+ - isiZulu (zul)
150
+ - Malagasy (mlg)
151
+ - Hausa (hau)
152
+ - chiShona (sna)
153
+ - Egyptian Arabic (arz)
154
+ - Chichewa (nya)
155
+ - Igbo (ibo)
156
+ - isiXhosa (xho)
157
+ - Yorùbá (yor)
158
+ - Sesotho (sot)
159
+ - Kinyarwanda (kin)
160
+ - Tigrinya (tir)
161
+ - Tsonga (tso)
162
+ - Oromo (orm)
163
+ - Rundi (run)
164
+ - Northern Sotho (nso)
165
+ - Ewe (ewe)
166
+ - Lingala (lin)
167
+ - Twi (twi)
168
+ - Nigerian Pidgin (pcm)
169
+ - Ga (gaa)
170
+ - Lozi (loz)
171
+ - Luganda (lug)
172
+ - Gun (guw)
173
+ - Bemba (bem)
174
+ - Efik (efi)
175
+ - Luvale (lue)
176
+ - Luba-Lulua (lua)
177
+ - Tonga (toi)
178
+ - Tshivenḓa (ven)
179
+ - Tumbuka (tum)
180
+ - Tetela (tll)
181
+ - Isoko (iso)
182
+ - Kaonde (kqn)
183
+ - Zande (zne)
184
+ - Umbundu (umb)
185
+ - Mossi (mos)
186
+ - Tiv (tiv)
187
+ - Luba-Katanga (lub)
188
+ - Fula (fuv)
189
+ - San Salvador Kongo (kwy)
190
+ - Baoulé (bci)
191
+ - Ruund (rnd)
192
+ - Luo (luo)
193
+ - Wolaitta (wal)
194
+ - Swazi (ssw)
195
+ - Lunda (lun)
196
+ - Wolof (wol)
197
+ - Nyaneka (nyk)
198
+ - Kwanyama (kua)
199
+ - Kikuyu (kik)
200
+ - Fon (fon)
201
+ - Bambara (bam)
202
+ - Chokwe (cjk)
203
+ - Dinka (dik)
204
+ - Dyula (dyu)
205
+ - Kabyle (kab)
206
+ - Kamba (kam)
207
+ - Kabiyè (kbp)
208
+ - Kanuri (knc)
209
+ - Kimbundu (kmb)
210
+ - Kikongo (kon)
211
+ - Nuer (nus)
212
+ - Sango (sag)
213
+ - Tamasheq (taq)
214
+ - Tamazight (tzm)
215
+ - N'ko (nqo)
216
+
217
+ # Specifically Fine-tuned on:
218
+ - Amharic (amh)
219
+ - Hausa (hau)
220
+ - Igbo (ibo)
221
+ - Kikuyu (kik)
222
+ - Kinyarwanda (kin)
223
+ - Luo (luo)
224
+ - Twi (twi)
225
+ - Yorùbá (yor)
226
+ - isiZulu (zul)
227
+ - Ewe (ewe)
228
+ - Lingala (lin)
229
+ - Wolof (wol)