metadata

language:
  - ko
  - zh
base_model: facebook/mbart-large-50-many-to-many-mmt
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: ko-zh
    results: []

ko-zh

This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt; the fine-tuning dataset is not documented. It achieves the following results on the evaluation set:

Loss: 1.2511
Bleu: 14.6038
Gen Len: 15.5513

Model description

More information needed

Intended uses & limitations

More information needed
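
Until this section is filled in, the following is a minimal inference sketch, not a documented usage recipe. It assumes the checkpoint keeps the tokenizer and language-code convention of the base mBART-50 model (`ko_KR` for Korean, `zh_CN` for Chinese) and that the translation direction is Korean to Chinese, as the model name suggests. The `model_id` argument and the function name are hypothetical; the card does not state the full repository path.

```python
# Assumptions (not stated in the card): Korean -> Chinese direction, and the
# same language codes as the base mBART-50 checkpoint.
SRC_LANG = "ko_KR"  # mBART-50 language code for Korean
TGT_LANG = "zh_CN"  # mBART-50 language code for Chinese


def translate_ko_to_zh(texts, model_id, max_new_tokens=64):
    """Translate a list of Korean strings into Chinese strings.

    `model_id` is a placeholder for the repository path or local directory
    that holds the fine-tuned checkpoint.
    """
    # Imported lazily so this module loads even without transformers installed.
    from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

    tokenizer = MBart50TokenizerFast.from_pretrained(model_id, src_lang=SRC_LANG)
    model = MBartForConditionalGeneration.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(
        **inputs,
        # mBART-50 selects the output language via the first generated token.
        forced_bos_token_id=tokenizer.lang_code_to_id[TGT_LANG],
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

A call might look like `translate_ko_to_zh(["안녕하세요."], "your-namespace/ko-zh")`, where the repository path is a stand-in to be replaced with the real one.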

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 4
total_train_batch_size: 64
total_eval_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 1000
num_epochs: 40
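
The totals in the list above follow from the per-device values, and the learning rate at any optimizer step follows from the linear schedule with warmup. The sketch below reproduces both in plain Python. The `linear_lr` helper is illustrative, and `total_steps` is a hypothetical parameter because the card does not state the planned number of optimizer steps.

```python
# Values copied from the hyperparameter list.
PER_DEVICE_TRAIN_BATCH = 4
PER_DEVICE_EVAL_BATCH = 4
NUM_DEVICES = 4
GRAD_ACCUM_STEPS = 4
PEAK_LR = 5e-05
WARMUP_STEPS = 1000

# Effective batch sizes: gradient accumulation applies to training only.
total_train_batch = PER_DEVICE_TRAIN_BATCH * NUM_DEVICES * GRAD_ACCUM_STEPS
total_eval_batch = PER_DEVICE_EVAL_BATCH * NUM_DEVICES


def linear_lr(step, total_steps, peak_lr=PEAK_LR, warmup_steps=WARMUP_STEPS):
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0.

    `total_steps` is an assumption supplied by the caller.
    """
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


print(total_train_batch)  # 64
print(total_eval_batch)   # 16
```

With these settings the rate climbs from 0 to 5e-05 over the first 1000 optimizer steps and then falls linearly until `total_steps` is reached.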

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
1.3824	0.8	1500	1.3011	11.4445	15.774
1.0646	1.61	3000	1.1916	13.1811	15.6757
0.8071	2.41	4500	1.1864	14.1901	15.2832
0.6496	3.22	6000	1.1979	14.3496	15.5238
0.6365	4.02	7500	1.2511	14.6014	15.5634
0.4942	4.82	9000	1.2521	14.3411	15.4888
0.3632	5.63	10500	1.3326	14.204	15.4075
0.2601	6.43	12000	1.4028	14.1714	15.4783
0.1919	7.23	13500	1.4764	13.9406	15.4543
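
The table can be read two ways when choosing a checkpoint: by highest BLEU or by lowest validation loss. The sketch below copies the rows from the table and picks the best step under each criterion. The `Row` type and variable names are illustrative; the card does not say which criterion was used to select the released weights.

```python
from collections import namedtuple

Row = namedtuple("Row", "train_loss epoch step val_loss bleu gen_len")

# Rows copied from the training results table.
ROWS = [
    Row(1.3824, 0.80, 1500, 1.3011, 11.4445, 15.7740),
    Row(1.0646, 1.61, 3000, 1.1916, 13.1811, 15.6757),
    Row(0.8071, 2.41, 4500, 1.1864, 14.1901, 15.2832),
    Row(0.6496, 3.22, 6000, 1.1979, 14.3496, 15.5238),
    Row(0.6365, 4.02, 7500, 1.2511, 14.6014, 15.5634),
    Row(0.4942, 4.82, 9000, 1.2521, 14.3411, 15.4888),
    Row(0.3632, 5.63, 10500, 1.3326, 14.2040, 15.4075),
    Row(0.2601, 6.43, 12000, 1.4028, 14.1714, 15.4783),
    Row(0.1919, 7.23, 13500, 1.4764, 13.9406, 15.4543),
]

best_by_bleu = max(ROWS, key=lambda r: r.bleu)
best_by_loss = min(ROWS, key=lambda r: r.val_loss)

print(best_by_bleu.step, best_by_bleu.bleu)      # 7500 14.6014
print(best_by_loss.step, best_by_loss.val_loss)  # 4500 1.1864
```

Validation loss bottoms out at step 4500 while BLEU peaks at step 7500, after which training loss keeps falling as both validation metrics worsen, a pattern consistent with overfitting. The step-7500 row also carries the validation loss (1.2511) quoted in the summary at the top of the card.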

Framework versions

Transformers 4.34.0
PyTorch 2.1.0+cu121
Datasets 2.14.5
Tokenizers 0.14.1
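
To check whether a local environment matches these versions, something like the following sketch could be used. It relies only on the standard library; the helper name `find_version_mismatches` is hypothetical, and comparing PyTorch without its local build suffix (`+cu121`) is a simplifying assumption.

```python
from importlib import metadata

# Versions listed on the card, keyed by PyPI distribution name.
EXPECTED = {
    "transformers": "4.34.0",
    "torch": "2.1.0",  # card lists 2.1.0+cu121; the CUDA suffix is ignored here
    "datasets": "2.14.5",
    "tokenizers": "0.14.1",
}


def find_version_mismatches(expected=EXPECTED):
    """Return {package: (wanted, installed)} for packages that differ or are missing."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        # Drop a local build suffix such as "+cu121" before comparing.
        base = installed.split("+")[0] if installed else None
        if base != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches
```

An empty dictionary from `find_version_mismatches()` means every listed package is installed at the version given on the card.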