---
library_name: transformers
license: apache-2.0
base_model: google-t5/t5-base
tags:
- generated_from_trainer
metrics:
- bleu
model-index:
- name: 400c5511b6dea2d366b982d82c7f2f47
  results: []
---

400c5511b6dea2d366b982d82c7f2f47

This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books [fr-it] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.7447
  • Data Size: 1.0
  • Epoch Runtime: 89.7493
  • Bleu: 8.7516
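
The BLEU score above is normally computed with a library such as `sacrebleu` or `evaluate`. As a self-contained illustration of what the metric measures (not the exact scorer used for this card), a minimal corpus-level BLEU with uniform n-gram weights and a brevity penalty can be sketched in plain Python:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def corpus_bleu(hypotheses, references, max_n=4):
    """Minimal corpus-level BLEU: clipped n-gram precisions up to max_n,
    geometric mean, and a brevity penalty. Illustrative only."""
    hyp_len = sum(len(h) for h in hypotheses)
    ref_len = sum(len(r) for r in references)
    log_prec = 0.0
    for n in range(1, max_n + 1):
        matches, total = 0, 0
        for hyp, ref in zip(hypotheses, references):
            hyp_ngrams, ref_ngrams = ngrams(hyp, n), ngrams(ref, n)
            # Clipped counts: a hypothesis n-gram is credited at most as
            # often as it occurs in the reference.
            matches += sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
            total += max(len(hyp) - n + 1, 0)
        if matches == 0:
            return 0.0  # unsmoothed BLEU is zero if any n-gram level has no match
        log_prec += math.log(matches / total) / max_n
    # Brevity penalty punishes hypotheses shorter than the references.
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100.0 * bp * math.exp(log_prec)

hyp = ["il gatto dorme sul tappeto".split()]
ref = ["il gatto dorme sul tappeto".split()]
print(corpus_bleu(hyp, ref))  # a perfect match scores 100.0
```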

Model description

More information needed

Intended uses & limitations

More information needed
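
No task-specific usage notes were provided. Assuming the standard T5 translation recipe (a `translate French to Italian:` source prefix, as in the usual `run_translation` setup), inference would look roughly like the sketch below; the repository id is inferred from this card's title and the prefix is an assumption, so both may need adjusting:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical repo id based on this card's title; requires the published weights.
model_id = "400c5511b6dea2d366b982d82c7f2f47"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# The task prefix assumes the conventional T5 translation setup and is not
# confirmed by this card.
inputs = tokenizer("translate French to Italian: Le chat dort.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```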

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
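
The batch-size figures above are internally consistent: with 4 devices and no gradient-accumulation steps reported, the total batch size is the per-device batch size times the number of GPUs. A quick sanity check:

```python
# Reported hyperparameters (from the list above).
train_batch_size = 8   # per device
eval_batch_size = 8    # per device
num_devices = 4

# With no gradient accumulation reported, the effective (total) batch size
# is simply per-device batch size times the number of devices.
total_train_batch_size = train_batch_size * num_devices
total_eval_batch_size = eval_batch_size * num_devices
print(total_train_batch_size, total_eval_batch_size)  # 32 32

# The results table logs 367 optimizer steps per full-data epoch, which at a
# total batch size of 32 implies roughly 367 * 32 ≈ 11,744 training pairs.
approx_train_pairs = 367 * total_train_batch_size
```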

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0     | 4.5136          | 0         | 7.0465        | 0.4966 |
| No log        | 1     | 367   | 4.1107          | 0.0078    | 8.0846        | 0.7376 |
| No log        | 2     | 734   | 3.5075          | 0.0156    | 8.7588        | 0.9149 |
| No log        | 3     | 1101  | 3.1627          | 0.0312    | 10.9472       | 1.1631 |
| No log        | 4     | 1468  | 3.0039          | 0.0625    | 12.4164       | 1.5942 |
| 0.1766        | 5     | 1835  | 2.8927          | 0.125     | 17.6485       | 2.0877 |
| 3.0996        | 6     | 2202  | 2.7307          | 0.25      | 27.5012       | 2.5708 |
| 2.8871        | 7     | 2569  | 2.5677          | 0.5       | 47.1074       | 4.1272 |
| 2.6588        | 8.0   | 2936  | 2.3864          | 1.0       | 92.7098       | 5.1262 |
| 2.4844        | 9.0   | 3303  | 2.2674          | 1.0       | 92.5091       | 5.4158 |
| 2.3644        | 10.0  | 3670  | 2.1871          | 1.0       | 88.8597       | 5.8018 |
| 2.2802        | 11.0  | 4037  | 2.1226          | 1.0       | 87.4977       | 6.1405 |
| 2.2119        | 12.0  | 4404  | 2.0682          | 1.0       | 88.0479       | 6.4334 |
| 2.1701        | 13.0  | 4771  | 2.0277          | 1.0       | 85.0628       | 6.5505 |
| 2.077         | 14.0  | 5138  | 1.9844          | 1.0       | 87.9358       | 6.8263 |
| 2.0346        | 15.0  | 5505  | 1.9514          | 1.0       | 86.9127       | 7.0307 |
| 1.9813        | 16.0  | 5872  | 1.9358          | 1.0       | 86.8131       | 7.2225 |
| 1.9399        | 17.0  | 6239  | 1.9115          | 1.0       | 87.6552       | 7.4084 |
| 1.8997        | 18.0  | 6606  | 1.8905          | 1.0       | 91.3228       | 7.6722 |
| 1.8824        | 19.0  | 6973  | 1.8781          | 1.0       | 90.9620       | 7.5620 |
| 1.8423        | 20.0  | 7340  | 1.8488          | 1.0       | 88.2698       | 7.7587 |
| 1.7856        | 21.0  | 7707  | 1.8354          | 1.0       | 88.6626       | 7.8544 |
| 1.7663        | 22.0  | 8074  | 1.8322          | 1.0       | 89.3511       | 7.7950 |
| 1.7213        | 23.0  | 8441  | 1.8203          | 1.0       | 87.7341       | 7.9426 |
| 1.6967        | 24.0  | 8808  | 1.8010          | 1.0       | 90.4521       | 8.0048 |
| 1.6952        | 25.0  | 9175  | 1.7929          | 1.0       | 91.0979       | 7.9804 |
| 1.6379        | 26.0  | 9542  | 1.7913          | 1.0       | 91.5110       | 7.8794 |
| 1.6116        | 27.0  | 9909  | 1.7820          | 1.0       | 86.8504       | 8.0096 |
| 1.583         | 28.0  | 10276 | 1.7778          | 1.0       | 85.1309       | 8.2697 |
| 1.5875        | 29.0  | 10643 | 1.7725          | 1.0       | 91.7085       | 8.2960 |
| 1.5704        | 30.0  | 11010 | 1.7684          | 1.0       | 92.7509       | 8.1364 |
| 1.5249        | 31.0  | 11377 | 1.7576          | 1.0       | 90.7023       | 8.2841 |
| 1.5196        | 32.0  | 11744 | 1.7593          | 1.0       | 94.4428       | 8.3993 |
| 1.471         | 33.0  | 12111 | 1.7525          | 1.0       | 91.7521       | 8.3661 |
| 1.4632        | 34.0  | 12478 | 1.7573          | 1.0       | 90.2000       | 8.3085 |
| 1.4363        | 35.0  | 12845 | 1.7428          | 1.0       | 89.6053       | 8.6387 |
| 1.4309        | 36.0  | 13212 | 1.7485          | 1.0       | 95.4347       | 8.8372 |
| 1.4165        | 37.0  | 13579 | 1.7513          | 1.0       | 84.3171       | 8.7722 |
| 1.4042        | 38.0  | 13946 | 1.7471          | 1.0       | 87.1024       | 8.8673 |
| 1.3828        | 39.0  | 14313 | 1.7447          | 1.0       | 89.7493       | 8.7516 |
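
The Data Size column shows the fraction of the training set used per epoch doubling from 1/128 at epoch 1 until the full dataset is reached at epoch 8, which suggests a data-scaling ramp-up schedule. This is an inference from the table, not something documented in the card; the observed pattern can be reproduced as:

```python
def data_fraction(epoch, full_at=8):
    """Training-set fraction at a given epoch, reconstructed from the
    Data Size column: the fraction doubles each epoch until it reaches 1.0.
    Illustrative reconstruction, not the training script's own code."""
    if epoch == 0:
        return 0.0  # epoch 0 row is the pre-training evaluation
    return min(1.0, 2.0 ** (epoch - full_at))

# Matches the table: epoch 1 -> ~0.0078 (1/128), epoch 7 -> 0.5, epoch 8+ -> 1.0
print([round(data_fraction(e), 4) for e in range(10)])
```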

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1