zlw-fiu
- source language name: West Slavic languages
- target language name: Finno-Ugrian languages
- OPUS readme: README.md
- model: transformer
- source language codes: dsb, cs, csb_Latn, hsb, pl, zlw
- target language codes: hu, vro, fi, liv_Latn, mdf, krl, fkv_Latn, mhr, et, sma, udm, vep, myv, kpv, se, izh, fiu
- dataset: opus
- release date: 2021-02-18
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- download original weights: opus-2021-02-18.zip
- a sentence-initial language token is required in the form of >>id<< (where id is a valid target language ID, usually a three-letter code); see the usage sketch after the benchmark tables below
- Training data:
- ces-fin: Tatoeba-train (1000000)
- ces-hun: Tatoeba-train (1000000)
- pol-est: Tatoeba-train (1000000)
- pol-fin: Tatoeba-train (1000000)
- pol-hun: Tatoeba-train (1000000)
- Validation data:
- ces-fin: Tatoeba-dev, 1000
- ces-hun: Tatoeba-dev, 1000
- est-pol: Tatoeba-dev, 1000
- fin-pol: Tatoeba-dev, 1000
- hun-pol: Tatoeba-dev, 1000
- mhr-pol: Tatoeba-dev, 461
- total-size-shuffled: 5426
- devset-selected: top 5000 lines of Tatoeba-dev.src.shuffled!
- Test data (sentence pairs / words):
- newssyscomb2009.ces-hun: 502/9733
- newstest2009.ces-hun: 2525/54965
- Tatoeba-test.ces-fin: 88/408
- Tatoeba-test.ces-hun: 1911/10336
- Tatoeba-test.multi-multi: 4562/25497
- Tatoeba-test.pol-chm: 5/36
- Tatoeba-test.pol-est: 15/98
- Tatoeba-test.pol-fin: 609/3293
- Tatoeba-test.pol-hun: 1934/11285
- test set translations file: test.txt
- test set scores file: eval.txt
- BLEU-scores

| Test set | score |
|----------|-------|
| Tatoeba-test.ces-fin | 57.2 |
| Tatoeba-test.ces-hun | 42.6 |
| Tatoeba-test.multi-multi | 39.4 |
| Tatoeba-test.pol-hun | 36.6 |
| Tatoeba-test.pol-fin | 36.1 |
| Tatoeba-test.pol-est | 20.9 |
| newssyscomb2009.ces-hun | 13.9 |
| newstest2009.ces-hun | 13.9 |
| Tatoeba-test.pol-chm | 2.0 |

- chr-F-scores

| Test set | score |
|----------|-------|
| Tatoeba-test.ces-fin | 0.71 |
| Tatoeba-test.ces-hun | 0.637 |
| Tatoeba-test.multi-multi | 0.616 |
| Tatoeba-test.pol-hun | 0.605 |
| Tatoeba-test.pol-fin | 0.592 |
| newssyscomb2009.ces-hun | 0.449 |
| newstest2009.ces-hun | 0.443 |
| Tatoeba-test.pol-est | 0.372 |
| Tatoeba-test.pol-chm | 0.007 |
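Because the model expects the sentence-initial >>id<< token to select the target language, a minimal usage sketch with the 🤗 Transformers Marian classes is given below. The Hub model id `Helsinki-NLP/opus-mt-zlw-fiu` and the example sentences are assumptions for illustration, not taken from this card; substitute the actual checkpoint path if it differs.

```python
# Minimal usage sketch (assumed Hub id "Helsinki-NLP/opus-mt-zlw-fiu";
# adjust to the actual checkpoint location).
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-zlw-fiu"
tokenizer = MarianTokenizer.from_pretrained(model_name)  # applies the SentencePiece preprocessing
model = MarianMTModel.from_pretrained(model_name)

# Each source sentence starts with a >>id<< token selecting the target language,
# e.g. >>fin<< (Finnish) or >>hun<< (Hungarian); example sentences are illustrative.
src_texts = [
    ">>fin<< To je moje kniha.",   # Czech source, Finnish target
    ">>hun<< To jest mój dom.",    # Polish source, Hungarian target
]

batch = tokenizer(src_texts, return_tensors="pt", padding=True)
translated = model.generate(**batch)
print(tokenizer.batch_decode(translated, skip_special_tokens=True))
```

The tokenizer handles the normalization and SentencePiece segmentation listed under pre-processing, so the only manual step is prepending the >>id<< token.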
System Info:
- hf_name: zlw-fiu
- source_languages: dsb,cs,csb_Latn,hsb,pl,zlw
- target_languages: hu,vro,fi,liv_Latn,mdf,krl,fkv_Latn,mhr,et,sma,udm,vep,myv,kpv,se,izh,fiu
- opus_readme_url: https://object.pouta.csc.fi/Tatoeba-MT-models/zlw-fiu/opus-2021-02-18.zip/README.md
- original_repo: Tatoeba-Challenge
- tags: ['translation']
- languages: ['dsb', 'cs', 'csb_Latn', 'hsb', 'pl', 'zlw', 'hu', 'vro', 'fi', 'liv_Latn', 'mdf', 'krl', 'fkv_Latn', 'mhr', 'et', 'sma', 'udm', 'vep', 'myv', 'kpv', 'se', 'izh', 'fiu']
- src_constituents: ['dsb', 'ces', 'csb_Latn', 'hsb', 'pol']
- tgt_constituents: ['hun', 'vro', 'fin', 'liv_Latn', 'mdf', 'krl', 'fkv_Latn', 'mhr', 'est', 'sma', 'udm', 'vep', 'myv', 'kpv', 'sme', 'izh']
- src_multilingual: True
- tgt_multilingual: True
- helsinki_git_sha: a0966db6db0ae616a28471ff0faf461b36fec07d
- transformers_git_sha: 3857f2b4e34912c942694489c2b667d9476e55f5
- port_machine: bungle
- port_time: 2021-06-29-15:24