AudreyVM committed on
Commit
0d35540
·
verified ·
1 Parent(s): bf18ece

update evals

reevaluated full model following discovery & fix of misaligned Aranese

Files changed (1)
  1. README.md +70 -72
README.md CHANGED
@@ -560,13 +560,16 @@ This section presents the evaluation metrics for English translation tasks.
560
  | | Bleu↑ | Ter↓ | ChrF↑ | Comet↑ | Comet-kiwi↑ | Bleurt↑ | MetricX↓ | MetricX-QE↓ |
561
  |:---------------------------------|-------:|------:|-------:|--------:|-------------:|---------:|----------:|-------------:|
562
  | **EN-XX** | | | | | | | | |
563
- | SalamandraTA-7b-instruct | **36.29** | **50.62** | 63.3 | **0.89** | **0.85** | **0.79** | **1.02** | **0.94** |
564
- | MADLAD400-7B-mt | 35.73 | 51.87 | **63.46** | 0.88 | **0.85** | **0.79** | 1.16 | 1.1 |
565
- | SalamandraTA-7b-base | 34.99 | 52.64 | 62.58 | 0.87 | 0.84 | 0.77 | 1.45 | 1.23 |
566
  | **XX-EN** | | | | | | | | |
567
- | SalamandraTA-7b-instruct | **44.69** | **41.72** | 68.17 | **0.89** | 0.85 | **0.8** | **1.09** | **1.11** |
568
- | SalamandraTA-7b-base | 44.12 | 43 | **68.43** | **0.89** | 0.85 | **0.8** | 1.13 | 1.22 |
569
- | MADLAD400-7B-mt | 43.2 | 43.33 | 67.98 | **0.89** | **0.86** | 0.8 | 1.13 | 1.15 |
 
 
 
570
 
571
 
572
  <img src="./images/bleu_en.png" alt="English" width="100%"/>
@@ -584,13 +587,14 @@ This section presents the evaluation metrics for Spanish translation tasks.
584
  | | Bleu↑ | Ter↓ | ChrF↑ | Comet↑ | Comet-kiwi↑ | Bleurt↑ | MetricX↓ | MetricX-QE↓ |
585
  |:---------------------------------|-------:|------:|-------:|--------:|-------------:|---------:|----------:|-------------:|
586
  | **ES-XX** | | | | | | | | |
587
- | SalamandraTA-7b-instruct | **23.67** | **65.71** | 53.55 | **0.87** | 0.82 | **0.75** | **1.04** | **1.05** |
588
- | MADLAD400-7B-mt | 22.48 | 68.91 | **53.93** | 0.86 | **0.83** | **0.75** | 1.09 | 1.14 |
589
- | SalamandraTA-7b-base | 21.63 | 70.08 | 52.98 | 0.86 | **0.83** | 0.74 | 1.24 | 1.12 |
590
  | **XX-ES** | | | | | | | | |
591
- | SalamandraTA-7b-instruct | **25.56** | **62.51** | 52.69 | **0.85** | 0.83 | 0.73 | **0.94** | **1.33** |
592
- | MADLAD400-7B-mt | 24.85 | 61.82 | **53** | **0.85** | **0.84** | **0.74** | 1.05 | 1.5 |
593
- | SalamandraTA-7b-base | 24.71 | 62.33 | 52.96 | **0.85** | **0.84** | 0.73 | 1.06 | 1.37 |
 
594
 
595
 
596
  <img src="./images/bleu_es.png" alt="Spanish" width="100%"/>
@@ -610,13 +614,14 @@ This section presents the evaluation metrics for Catalan translation tasks.
610
  | | Bleu↑ | Ter↓ | ChrF↑ | Comet↑ | Comet-kiwi↑ | Bleurt↑ | MetricX↓ | MetricX-QE↓ |
611
  |:---------------------------------|-------:|------:|-------:|--------:|-------------:|---------:|----------:|-------------:|
612
  | **CA-XX** | | | | | | | | |
613
- | MADLAD400-7B-mt | **29.37** | 59.01 | **58.47** | **0.87** | **0.81** | **0.77** | **1.08** | 1.31 |
614
- | SalamandraTA-7b-instruct | 29.23 | **58.32** | 57.76 | **0.87** | **0.81** | **0.77** | **1.08** | **1.22** |
615
- | SalamandraTA-7b-base | 29.06 | 59.32 | 58 | **0.87** | **0.81** | 0.76 | 1.23 | 1.28 |
616
  | **XX-CA** | | | | | | | | |
617
- | SalamandraTA-7b-instruct | **33.64** | **54.49** | 59.03 | **0.86** | 0.8 | **0.75** | **1.07** | **1.6** |
618
- | MADLAD400-7B-mt | 33.02 | 55.01 | 59.38 | **0.86** | **0.81** | **0.75** | 1.18 | 1.79 |
619
- | SalamandraTA-7b-base | 32.75 | 55.78 | **59.42** | **0.86** | **0.81** | **0.75** | 1.17 | 1.63 |
 
620
 
621
 
622
  <img src="./images/bleu_ca.png" alt="Catalan" width="100%"/>
@@ -634,13 +639,13 @@ This section presents the evaluation metrics for Galician translation tasks.
634
  | | Bleu↑ | Ter↓ | ChrF↑ | Comet↑ | Comet-kiwi↑ | Bleurt↑ | MetricX↓ | MetricX-QE↓ |
635
  |:---------------------------------|-------:|------:|-------:|--------:|-------------:|---------:|----------:|-------------:|
636
  | **GL-XX** | | | | | | | | |
637
- | SalamandraTA-7b-instruct | **28.13** | **59.68** | **56.94** | **0.87** | **0.85** | **0.76** | **1.08** | **1.2** |
638
- | SalamandraTA-7b-base | 27.47 | 61.39 | **56.96** | **0.87** | 0.82 | 0.76 | 1.23 | 1.29 |
639
- | MADLAD400-7B-mt | 26.43 | 64.3 | 55.99 | 0.86 | **0.85** | 0.76 | 1.35 | 2.06 |
640
  | **XX-GL** | | | | | | | | |
641
- | SalamandraTA-7b-instruct | **30.94** | **55.24** | **57.69** | **0.86** | **0.85** | **0.7** | **0.9** | **1.38** |
642
- | SalamandraTA-7b-base | 28.22 | 59.52 | 56.28 | 0.85 | 0.82 | 0.69 | 1.27 | 1.78 |
643
- | MADLAD400-7B-mt | 27.77 | 59.46 | 54.92 | 0.84 | **0.85** | 0.67 | 1.42 | 2.72 |
644
 
645
  <img src="./images/bleu_gl.png" alt="Galician" width="100%"/>
646
 
@@ -658,13 +663,13 @@ This section presents the evaluation metrics for Basque translation tasks.
658
  | | Bleu↑ | Ter↓ | ChrF↑ | Comet↑ | Comet-kiwi↑ | Bleurt↑ | MetricX↓ | MetricX-QE↓ |
659
  |:---------------------------------|-------:|------:|-------:|--------:|-------------:|---------:|----------:|-------------:|
660
  | **EU-XX** | | | | | | | | |
661
- | SalamandraTA-7b-instruct | **22.99** | **65.8** | 52.06 | **0.86** | **0.84** | **0.74** | **1.13** | **1.38** |
662
- | SalamandraTA-7b-base | 22.87 | 67.38 | **52.19** | **0.86** | 0.79 | **0.74** | 1.19 | 1.61 |
663
- | MADLAD400-7B-mt | 21.26 | 69.75 | 49.8 | 0.85 | 0.82 | 0.72 | 1.54 | 2.71 |
664
  | **XX-EU** | | | | | | | | |
665
- | SalamandraTA-7b-instruct | **17.5** | **73.13** | 54.67 | **0.85** | **0.83** | **0.8** | **0.85** | **1.03** |
666
- | SalamandraTA-7b-base | 17.01 | 75.92 | **55.22** | **0.85** | 0.77 | **0.8** | 1.04 | 1.17 |
667
- | MADLAD400-7B-mt | 13.64 | 85.01 | 50.96 | 0.82 | 0.8 | 0.78 | 2.09 | 3.58 |
668
 
669
 
670
  <img src="./images/bleu_eu.png" alt="Basque" width="100%"/>
@@ -682,20 +687,20 @@ against [Transducens/IbRo-nllb](https://huggingface.co/Transducens/IbRo-nllb) [(
682
 
683
  #### English-XX
684
 
685
- | | Source | Target | Bleu↑ | Ter↓ | ChrF↑ |
686
- |:---------------------------------|:---------|:---------|-------:|-------:|-------:|
687
- | SalamandraTA-7b-instruct | en | ast | **31.49** | **54.01** | **60.65** |
688
- | SalamandraTA-7b-base | en | ast | 26.4 | 64.02 | 57.35 |
689
- | nllb-200-3.3B | en | ast | 22.02 | 77.26 | 51.4 |
690
- | Transducens/IbRo-nllb | en | ast | 20.56 | 63.92 | 53.32 |
691
  | | | | | | |
692
- | SalamandraTA-7b-instruct | en | arn | **13.04** | **87.13** | **37.56** |
693
- | SalamandraTA-7b-base | en | arn | 8.36 | 90.85 | 34.06 |
694
- | Transducens/IbRo-nllb | en | arn | 7.63 | 89.36 | 33.88 |
695
  | | | | | | |
696
- | SalamandraTA-7b-instruct | en | arg | **20.43** | **65.62** | **50.79** |
697
- | SalamandraTA-7b-base | en | arg | 12.24 | 73.48 | 44.75 |
698
- | Transducens/IbRo-nllb | en | arg | 14.07 | 70.37 | 46.89 |
 
699
 
700
  </details>
701
 
@@ -705,23 +710,19 @@ against [Transducens/IbRo-nllb](https://huggingface.co/Transducens/IbRo-nllb) [(
705
 
706
  #### Spanish-XX
707
 
708
- | | Source | Target | Bleu↑ | Ter↓ | ChrF↑ |
709
- |:---------------------------------|:---------|:---------|-------:|-------:|-------:|
710
- | SalamandraTA-7b-instruct | es | ast | **21.28** | **68.11** | **52.73** |
711
- | SalamandraTA-7b-base | es | ast | 17.65 | 75.78 | 51.05 |
712
- | Transducens/IbRo-nllb | es | ast | 16.79 | 76.36 | 50.89 |
713
- | SalamandraTA-2B | es | ast | 16.68 | 77.29 | 49.46 |
714
- | nllb-200-3.3B | es | ast | 11.85 | 100.86 | 40.27 |
715
  | | | | | | |
716
- | SalamandraTA-7b-base | es | arn | **29.19** | **71.85** | **49.42** |
717
- | Transducens/IbRo-nllb | es | arn | 28.45 | 72.56 | 49.28 |
718
- | SalamandraTA-7b-instruct | es | arn | 26.82 | 74.04 | 47.55 |
719
- | SalamandraTA-2B | es | arn | 25.41 | 74.71 | 47.33 |
720
  | | | | | | |
721
- | Transducens/IbRo-nllb | es | arg | **59.75** | **28.01** | **78.73** |
722
- | SalamandraTA-7b-base | es | arg | 53.96 | 31.51 | 76.08 |
723
- | SalamandraTA-7b-instruct | es | arg | 47.54 | 36.57 | 72.38 |
724
- | SalamandraTA-2B | es | arg | 44.57 | 37.93 | 71.32 |
725
 
726
  </details>
727
 
@@ -733,23 +734,20 @@ against [Transducens/IbRo-nllb](https://huggingface.co/Transducens/IbRo-nllb) [(
733
  #### Catalan-XX
734
 
735
 
736
- | | Source | Target | Bleu↑ | Ter↓ | ChrF↑ |
737
- |:---------------------------------|:---------|:---------|-------:|-------:|-------:|
738
- | SalamandraTA-7b-instruct | ca | ast | **27.86** | **58.19** | 57.98 |
739
- | SalamandraTA-7b-base | ca | ast | 26.11 | 63.63 | **58.08** |
740
- | SalamandraTA-2B | ca | ast | 25.32 | 62.59 | 55.98 |
741
- | Transducens/IbRo-nllb | ca | ast | 24.77 | 61.60 | 57.49 |
742
- | nllb-200-3.3B | ca | ast | 17.17 | 91.47 | 45.83 |
743
  | | | | | | |
744
- | SalamandraTA-7b-base | ca | arn | **31.76** | **53.70** | **60.71** |
745
- | Transducens/IbRo-nllb | ca | arn | 31.21 | 54.30 |60.30 |
746
- | SalamandraTA-7b-instruct | ca | arn | 30.89 | 54.70 | 59.78 |
747
- | SalamandraTA-2B | ca | arn | 27.67 | 57.05 | 58.1 |
748
  | | | | | | |
749
- | Transducens/IbRo-nllb | ca | arg | **24.44** | **60.79** | **55.51** |
750
- | SalamandraTA-7b-base | ca | arg | 22.53 | 62.37 | 54.32 |
751
- | SalamandraTA-7b-instruct | ca | arg | 21.62 | 63.38 | 53.01 |
752
- | SalamandraTA-2B | ca | arg | 18.6 | 65.82 | 51.21 |
753
 
754
  </details>
755
 
 
560
  | | Bleu↑ | Ter↓ | ChrF↑ | Comet↑ | Comet-kiwi↑ | Bleurt↑ | MetricX↓ | MetricX-QE↓ |
561
  |:---------------------------------|-------:|------:|-------:|--------:|-------------:|---------:|----------:|-------------:|
562
  | **EN-XX** | | | | | | | | |
563
+ | SalamandraTA-7b-instruct | 35.20 | 53.40 | 61.58 | **0.89** | **0.86** | 0.78 | **0.96** | **0.81** |
564
+ | MADLAD400-7B | **35.73** | **51.87** | **63.46** | 0.88 | 0.85 | **0.79** | 1.16 | 1.10 |
565
+ | SalamandraTA-7b-base | 34.99 | 52.64 | 62.58 | 0.87 | 0.84 | 0.77 | 1.45 | 1.23 |
566
  | **XX-EN** | | | | | | | | |
567
+ | SalamandraTA-7b-instruct | **44.37** | **42.49** | 68.29 | **0.89** | **0.86** | **0.80** | **1.05** | **0.99** |
568
+ | MADLAD400-7B | 43.20 | 43.33 | 67.98 | **0.89** | **0.86** | **0.80** | 1.13 | 1.15 |
569
+ | SalamandraTA-7b-base | 44.12 | 43.00 | **68.43** | **0.89** | 0.85 | **0.80** | 1.13 | 1.22 |
570
+
571
+
572
+
573
 
574
 
575
  <img src="./images/bleu_en.png" alt="English" width="100%"/>
 
587
  | | Bleu↑ | Ter↓ | ChrF↑ | Comet↑ | Comet-kiwi↑ | Bleurt↑ | MetricX↓ | MetricX-QE↓ |
588
  |:---------------------------------|-------:|------:|-------:|--------:|-------------:|---------:|----------:|-------------:|
589
  | **ES-XX** | | | | | | | | |
590
+ | SalamandraTA-7b-instruct | **23.68** | **67.31** | **53.98** | **0.87** | **0.83** | **0.76** | **0.93** | **0.80** |
591
+ | MADLAD400-7B | 22.48 | 68.91 | 53.93 | 0.86 | **0.83** | 0.75 | 1.09 | 1.14 |
592
+ | SalamandraTA-7b-base | 21.63 | 70.08 | 52.98 | 0.86 | **0.83** | 0.74 | 1.24 | 1.12 |
593
  | **XX-ES** | | | | | | | | |
594
+ | SalamandraTA-7b-instruct | **26.40** | 62.27 | **53.54** | **0.85** | **0.84** | **0.74** | **0.80** | **1.07** |
595
+ | MADLAD400-7B | 24.85 | **61.82** | 53.00 | **0.85** | **0.84** | **0.74** | 1.05 | 1.50 |
596
+ | SalamandraTA-7b-base | 24.71 | 62.33 | 52.96 | **0.85** | **0.84** | 0.73 | 1.06 | 1.37 |
597
+
598
 
599
 
600
  <img src="./images/bleu_es.png" alt="Spanish" width="100%"/>
 
614
  | | Bleu↑ | Ter↓ | ChrF↑ | Comet↑ | Comet-kiwi↑ | Bleurt↑ | MetricX↓ | MetricX-QE↓ |
615
  |:---------------------------------|-------:|------:|-------:|--------:|-------------:|---------:|----------:|-------------:|
616
  | **CA-XX** | | | | | | | | |
617
+ | SalamandraTA-7b-instruct | **29.50** | 59.26 | 58.21 | **0.88** | **0.81** | **0.77** | **0.97** | **0.98** |
618
+ | MADLAD400-7B | 29.37 | **59.01** | **58.47** | 0.87 | **0.81** | **0.77** | 1.08 | 1.31 |
619
+ | SalamandraTA-7b-base | 29.06 | 59.32 | 58.00 | 0.87 | **0.81** | 0.76 | 1.23 | 1.28 |
620
  | **XX-CA** | | | | | | | | |
621
+ | SalamandraTA-7b-instruct | **34.51** | **54.21** | **60.10** | **0.86** | **0.81** | **0.76** | **0.90** | **1.29** |
622
+ | MADLAD400-7B | 33.02 | 55.01 | 59.38 | **0.86** | **0.81** | 0.75 | 1.18 | 1.79 |
623
+ | SalamandraTA-7b-base | 32.75 | 55.78 | 59.42 | **0.86** | **0.81** | 0.75 | 1.17 | 1.63 |
624
+
625
 
626
 
627
  <img src="./images/bleu_ca.png" alt="Catalan" width="100%"/>
 
639
  | | Bleu↑ | Ter↓ | ChrF↑ | Comet↑ | Comet-kiwi↑ | Bleurt↑ | MetricX↓ | MetricX-QE↓ |
640
  |:---------------------------------|-------:|------:|-------:|--------:|-------------:|---------:|----------:|-------------:|
641
  | **GL-XX** | | | | | | | | |
642
+ | SalamandraTA-7b-instruct | **36.95** | **50.12** | **62.55** | **0.88** | **0.85** | **0.77** | **0.86** | **0.98** |
643
+ | MADLAD400-7B | 26.43 | 64.30 | 55.99 | 0.86 | **0.85** | 0.76 | 1.35 | 2.06 |
644
+ | SalamandraTA-7b-base | 27.47 | 61.39 | 56.96 | 0.87 | 0.82 | 0.76 | 1.23 | 1.29 |
645
  | **XX-GL** | | | | | | | | |
646
+ | SalamandraTA-7b-instruct | **34.37** | **52.49** | **60.99** | **0.88** | **0.85** | **0.73** | **0.75** | **0.92** |
647
+ | MADLAD400-7B | 27.77 | 59.46 | 54.92 | 0.84 | **0.85** | 0.67 | 1.42 | 2.72 |
648
+ | SalamandraTA-7b-base | 28.22 | 59.52 | 56.28 | 0.85 | 0.82 | 0.69 | 1.27 | 1.78 |
649
 
650
  <img src="./images/bleu_gl.png" alt="Galician" width="100%"/>
651
 
 
663
  | | Bleu↑ | Ter↓ | ChrF↑ | Comet↑ | Comet-kiwi↑ | Bleurt↑ | MetricX↓ | MetricX-QE↓ |
664
  |:---------------------------------|-------:|------:|-------:|--------:|-------------:|---------:|----------:|-------------:|
665
  | **EU-XX** | | | | | | | | |
666
+ | SalamandraTA-7b-instruct | **29.89** | **58.54** | **56.66** | **0.87** | **0.85** | **0.76** | **0.90** | **0.89** |
667
+ | MADLAD400-7B | 21.26 | 69.75 | 49.80 | 0.85 | 0.82 | 0.72 | 1.54 | 2.71 |
668
+ | SalamandraTA-7b-base | 22.87 | 67.38 | 52.19 | 0.86 | 0.79 | 0.74 | 1.19 | 1.61 |
669
  | **XX-EU** | | | | | | | | |
670
+ | SalamandraTA-7b-instruct | **18.89** | **71.74** | **57.16** | **0.87** | **0.84** | **0.82** | **0.58** | **0.44** |
671
+ | MADLAD400-7B | 13.64 | 85.01 | 50.96 | 0.82 | 0.80 | 0.78 | 2.09 | 3.58 |
672
+ | SalamandraTA-7b-base | 17.01 | 75.92 | 55.22 | 0.85 | 0.77 | 0.80 | 1.04 | 1.17 |
673
 
674
 
675
  <img src="./images/bleu_eu.png" alt="Basque" width="100%"/>
 
687
 
688
  #### English-XX
689
 
690
+ | | Source | Target | Bleu↑ | Ter↓ | ChrF↑ |
691
+ |:-------------------------|:---------|:---------|:----------|:----------|:----------|
692
+ | SalamandraTA-7b-instruct | en | ast | **31.79** | **54.07** | **61.78** |
693
+ | SalamandraTA-7b-base | en | ast | 26.40 | 64.02 | 57.35 |
694
+ | Transducens/IbRo-nllb | en | ast | 20.56 | 63.92 | 53.32 |
 
695
  | | | | | | |
696
+ | SalamandraTA-7b-instruct | en | arn | **22.77** | **66.06** | **52.61** |
697
+ | SalamandraTA-7b-base | en | arn | 14.13 | 74.05 | 46.17 |
698
+ | Transducens/IbRo-nllb | en | arn | 12.81 | 73.21 | 45.76 |
699
  | | | | | | |
700
+ | SalamandraTA-7b-instruct | en | arg | **19.74** | 71.58 | **51.08** |
701
+ | Transducens/IbRo-nllb | en | arg | 14.07 | **70.37** | 46.89 |
702
+ | SalamandraTA-7b-base | en | arg | 12.24 | 73.48 | 44.75 |
703
+
704
 
705
  </details>
706
 
 
710
 
711
  #### Spanish-XX
712
 
713
+ | | Source | Target | Bleu↑ | Ter↓ | ChrF↑ |
714
+ |:-------------------------|:---------|:---------|:----------|:----------|:----------|
715
+ | SalamandraTA-7b-instruct | es | ast | **20.66** | **71.81** | **53.14** |
716
+ | SalamandraTA-7b-base | es | ast | 17.65 | 75.78 | 51.05 |
717
+ | Transducens/IbRo-nllb | es | ast | 16.79 | 76.36 | 50.89 |
 
 
718
  | | | | | | |
719
+ | SalamandraTA-7b-base | es | arn | **51.59** | **35.51** | **73.50** |
720
+ | Transducens/IbRo-nllb | es | arn | 50.20 | 36.60 | 73.16 |
721
+ | SalamandraTA-7b-instruct | es | arn | 47.37 | 39.29 | 70.65 |
 
722
  | | | | | | |
723
+ | Transducens/IbRo-nllb | es | arg | **59.75** | **28.01** | **78.73** |
724
+ | SalamandraTA-7b-base | es | arg | 53.96 | 31.51 | 76.08 |
725
+ | SalamandraTA-7b-instruct | es | arg | 44.10 | 39.98 | 71.12 |
 
726
 
727
  </details>
728
 
 
734
  #### Catalan-XX
735
 
736
 
737
+ | | Source | Target | Bleu↑ | Ter↓ | ChrF↑ |
738
+ |:-------------------------|:---------|:---------|:----------|:----------|:----------|
739
+ | SalamandraTA-7b-instruct | ca | ast | **28.13** | **58.84** | **58.98** |
740
+ | SalamandraTA-7b-base | ca | ast | 26.11 | 63.63 | 58.08 |
741
+ | Transducens/IbRo-nllb | ca | ast | 24.77 | 61.60 | 57.49 |
 
 
742
  | | | | | | |
743
+ | SalamandraTA-7b-base | ca | arn | **31.76** | **53.71** | **60.71** |
744
+ | Transducens/IbRo-nllb | ca | arn | 31.22 | 54.30 | 60.30 |
745
+ | SalamandraTA-7b-instruct | ca | arn | 30.89 | 54.70 | 59.78 |
 
746
  | | | | | | |
747
+ | Transducens/IbRo-nllb | ca | arg | **24.44** | **60.79** | **55.51** |
748
+ | SalamandraTA-7b-base | ca | arg | 22.53 | 62.37 | 54.32 |
749
+ | SalamandraTA-7b-instruct | ca | arg | 20.96 | 65.64 | 52.41 |
750
+
751
 
752
  </details>
753