Commit f353e75 by jordimas
1 parent: 257f741

Update 2024-09-03 model

README.md CHANGED
@@ -49,8 +49,8 @@ print(tokenizer.detokenize(translated[0][0]['tokens']))
 
 | testset | BLEU |
 |---------------------------------------|-------|
-| test dataset (from train/dev/test) | 74.1 |
-| Flores200 dataset | 31.4 |
+| test dataset (from train/dev/test) | 66.4 |
+| Flores200 dataset | 32.5 |
 
 ## Additional information
 * https://github.com/Softcatala/nmt-models
config.json CHANGED
@@ -1,2 +1,9 @@
 {
-}
+  "add_source_bos": false,
+  "add_source_eos": false,
+  "bos_token": "<s>",
+  "decoder_start_token": "<s>",
+  "eos_token": "</s>",
+  "layer_norm_epsilon": null,
+  "unk_token": "<unk>"
+}
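The new config.json fields can be read with any JSON parser. The sketch below loads them and illustrates one plausible reading of the two `add_source_*` flags: whether the BOS/EOS tokens get wrapped around a tokenized source sentence. The `prepare_source` helper is hypothetical and written only for illustration; it is not part of CTranslate2, and the runtime's actual handling of these flags may differ.

```python
import json

# Contents of the new config.json from this commit.
CONFIG_JSON = """
{
  "add_source_bos": false,
  "add_source_eos": false,
  "bos_token": "<s>",
  "decoder_start_token": "<s>",
  "eos_token": "</s>",
  "layer_norm_epsilon": null,
  "unk_token": "<unk>"
}
"""


def prepare_source(tokens, config):
    """Hypothetical helper: wrap source tokens according to the two flags.

    Assumption: add_source_bos / add_source_eos mean "prepend BOS" and
    "append EOS" to each source sequence.
    """
    out = list(tokens)
    if config.get("add_source_bos"):
        out.insert(0, config["bos_token"])
    if config.get("add_source_eos"):
        out.append(config["eos_token"])
    return out


config = json.loads(CONFIG_JSON)
# Both flags are false here, so the source tokens pass through unchanged.
print(prepare_source(["▁Ola", "▁mundo"], config))
```

Under this reading, the committed values would leave source sentences untouched, and `layer_norm_epsilon: null` would simply mean no explicit value is set in the file.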
model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d88b49a20ced6e0581b08714285f097d132437be725ae52a9ccdc462265484b3
-size 70727761
+oid sha256:ddca667b84f79ebd2b4b307778b7dc286b81109713d3bf908adb019e88fc433e
+size 70727741
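model.bin is stored as a Git LFS pointer, so the diff only shows the new SHA-256 digest and byte size. A minimal sketch of checking a downloaded file against such a pointer follows; the helper names `parse_lfs_pointer` and `matches_pointer` are made up for this example, and the local file path is an assumption.

```python
import hashlib
from pathlib import Path

# New pointer for model.bin, copied from the diff above.
NEW_MODEL_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:ddca667b84f79ebd2b4b307778b7dc286b81109713d3bf908adb019e88fc433e
size 70727741
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of a Git LFS pointer into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path, pointer_text):
    """Return True if the file's size and SHA-256 digest match the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    algo, _, expected_digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    path = Path(path)
    # Cheap check first: a size mismatch avoids hashing the whole file.
    if path.stat().st_size != int(fields["size"]):
        return False
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Hash in 1 MiB chunks so large models are not loaded into memory.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_digest
```

For example, `matches_pointer("model.bin", NEW_MODEL_POINTER)` would be expected to return `True` for a complete download of the 2024-09-03 model and `False` for the 2022-11-17 file, which is 20 bytes larger.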
model_description.txt CHANGED
@@ -1,9 +1,9 @@
 Model description: glg-cat
-Date: 2022-11-17
-TF version 2.10.0, OpenNMT version 2.29.1, CTranslate2 version 2.24.0
+Date: 2024-09-03
+TF version 2.10.0, OpenNMT version 2.29.1, CTranslate2 version 3.22.0
 Test data set
-BLEU|nrefs:1|case:mixed|eff:no|tok:13a|smooth:exp|version:2.1.0 = 74.1 88.8/79.3/72.2/66.0 (BP = 0.973 ratio = 0.974 hyp_len = 102921 ref_len = 105694)
-chrF2|nrefs:1|case:mixed|eff:yes|nc:6|nw:0|space:no|version:2.1.0 = 86.2
+BLEU|nrefs:1|bs:1000|seed:12345|case:mixed|eff:no|tok:13a|smooth:exp|version:2.1.0 = 66.4 (μ = 66.4 ± 1.1) 82.3/70.6/62.8/56.7 (BP = 0.985 ratio = 0.985 hyp_len = 202651 ref_len = 205716)
+chrF2|nrefs:1|bs:1000|seed:12345|case:mixed|eff:yes|nc:6|nw:0|space:no|version:2.1.0 = 80.9 (μ = 80.9 ± 0.9)
 Flores data set
-BLEU|nrefs:1|case:mixed|eff:no|tok:13a|smooth:exp|version:2.1.0 = 31.4 61.4/37.7/24.9/16.8 (BP = 1.000 ratio = 1.026 hyp_len = 28017 ref_len = 27304)
-chrF2|nrefs:1|case:mixed|eff:yes|nc:6|nw:0|space:no|version:2.1.0 = 59.9
+BLEU|nrefs:1|bs:1000|seed:12345|case:mixed|eff:no|tok:13a|smooth:exp|version:2.1.0 = 32.5 (μ = 32.5 ± 1.0) 62.4/38.9/26.0/17.7 (BP = 1.000 ratio = 1.023 hyp_len = 27945 ref_len = 27304)
+chrF2|nrefs:1|bs:1000|seed:12345|case:mixed|eff:yes|nc:6|nw:0|space:no|version:2.1.0 = 60.7 (μ = 60.7 ± 0.7)
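The score lines in model_description.txt follow the sacreBLEU signature format, and the new ones add `bs:1000|seed:12345` plus a `(μ = … ± …)` term, which appears to be a bootstrap-resampling estimate. The sketch below parses such lines so the old and new figures can be compared programmatically. The regular expressions and the `parse_score_line` helper are assumptions written for this example, not part of sacreBLEU.

```python
import re

# "<signature> = <score>" optionally followed by "(μ = <mean> ± <half_width>)".
SCORE_RE = re.compile(
    r"^(?P<signature>\S+) = (?P<score>\d+(?:\.\d+)?)"
    r"(?: \(μ = (?P<mean>\d+(?:\.\d+)?) ± (?P<half_width>\d+(?:\.\d+)?)\))?"
)
REF_LEN_RE = re.compile(r"ref_len = (\d+)")


def parse_score_line(line):
    """Extract metric name, signature parameters, score and ref_len."""
    match = SCORE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a score line: {line!r}")
    metric, *pairs = match.group("signature").split("|")
    params = dict(pair.split(":", 1) for pair in pairs)
    half_width = match.group("half_width")
    ref_len = REF_LEN_RE.search(line)
    return {
        "metric": metric,
        "params": params,
        "score": float(match.group("score")),
        "half_width": float(half_width) if half_width else None,
        "ref_len": int(ref_len.group(1)) if ref_len else None,
    }


# Test-set BLEU lines copied from the diff above (old, then new).
OLD_TEST_BLEU = "BLEU|nrefs:1|case:mixed|eff:no|tok:13a|smooth:exp|version:2.1.0 = 74.1 88.8/79.3/72.2/66.0 (BP = 0.973 ratio = 0.974 hyp_len = 102921 ref_len = 105694)"
NEW_TEST_BLEU = "BLEU|nrefs:1|bs:1000|seed:12345|case:mixed|eff:no|tok:13a|smooth:exp|version:2.1.0 = 66.4 (μ = 66.4 ± 1.1) 82.3/70.6/62.8/56.7 (BP = 0.985 ratio = 0.985 hyp_len = 202651 ref_len = 205716)"

old, new = parse_score_line(OLD_TEST_BLEU), parse_score_line(NEW_TEST_BLEU)
# Differing reference lengths hint that the two runs used different test sets.
same_references = old["ref_len"] == new["ref_len"]
```

Comparing `ref_len` is a cheap sanity check before reading too much into a score change. The in-domain test set's reference length goes from 105694 to 205716, so the move from 74.1 to 66.4 BLEU is probably not a like-for-like comparison. The Flores reference length is 27304 in both runs, which makes the 31.4 to 32.5 change the more comparable figure.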
shared_vocabulary.json ADDED
The diff for this file is too large to render. See raw diff
 
shared_vocabulary.txt DELETED
The diff for this file is too large to render. See raw diff
 
sp_m.model CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ed20468df719c191291ee60835e235af1bc7a67a6a270ae936bd4ec868f81c98
-size 1136059
+oid sha256:4af48717d42260723d8218a562a83cf22af67e3ab816bcb8ba4785ab8c06990a
+size 1144913