400c5511b6dea2d366b982d82c7f2f47

This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books [fr-it] (French–Italian) dataset. It achieves the following results on the evaluation set:

  • Loss: 1.7447
  • Data Size: 1.0
  • Epoch Runtime: 89.7493 seconds
  • Bleu: 8.7516
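A quick way to try the checkpoint, sketched against the Transformers `pipeline` API. The task prefix is an assumption: T5 checkpoints are usually prompted with a natural-language prefix, but the exact prefix used for this fine-tune is not stated in the card.

```python
def build_input(text: str) -> str:
    # Hypothetical T5-style task prefix; the prefix actually used during
    # fine-tuning is not documented in this card.
    return f"translate French to Italian: {text}"

def translate(text: str) -> str:
    # Lazy import so build_input() is usable without transformers installed.
    from transformers import pipeline

    translator = pipeline(
        "translation_fr_to_it",
        model="contemmcm/400c5511b6dea2d366b982d82c7f2f47",
    )
    return translator(build_input(text), max_length=128)[0]["translation_text"]
```

Downloading the checkpoint and generating is only needed inside `translate`; the prefix helper can be inspected on its own.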

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
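The hyperparameters above map onto the Transformers `Seq2SeqTrainingArguments` keywords roughly as follows; the exact argument names are an assumption based on the Trainer API, and the effective batch size of 32 falls out of the per-device batch size times the device count.

```python
# Sketch of the listed hyperparameters as Trainer-style keyword arguments
# (assumed mapping; per-device values come from the card's batch sizes).
training_kwargs = dict(
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="constant",
    num_train_epochs=50,
)

NUM_DEVICES = 4  # multi-GPU, as listed above

# total_train_batch_size = per-device batch size x number of devices
effective_train_batch = training_kwargs["per_device_train_batch_size"] * NUM_DEVICES
```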

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime (s) | Bleu   |
|---------------|-------|-------|-----------------|-----------|-------------------|--------|
| No log        | 0     | 0     | 4.5136          | 0         | 7.0465            | 0.4966 |
| No log        | 1     | 367   | 4.1107          | 0.0078    | 8.0846            | 0.7376 |
| No log        | 2     | 734   | 3.5075          | 0.0156    | 8.7588            | 0.9149 |
| No log        | 3     | 1101  | 3.1627          | 0.0312    | 10.9472           | 1.1631 |
| No log        | 4     | 1468  | 3.0039          | 0.0625    | 12.4164           | 1.5942 |
| 0.1766        | 5     | 1835  | 2.8927          | 0.125     | 17.6485           | 2.0877 |
| 3.0996        | 6     | 2202  | 2.7307          | 0.25      | 27.5012           | 2.5708 |
| 2.8871        | 7     | 2569  | 2.5677          | 0.5       | 47.1074           | 4.1272 |
| 2.6588        | 8     | 2936  | 2.3864          | 1.0       | 92.7098           | 5.1262 |
| 2.4844        | 9     | 3303  | 2.2674          | 1.0       | 92.5091           | 5.4158 |
| 2.3644        | 10    | 3670  | 2.1871          | 1.0       | 88.8597           | 5.8018 |
| 2.2802        | 11    | 4037  | 2.1226          | 1.0       | 87.4977           | 6.1405 |
| 2.2119        | 12    | 4404  | 2.0682          | 1.0       | 88.0479           | 6.4334 |
| 2.1701        | 13    | 4771  | 2.0277          | 1.0       | 85.0628           | 6.5505 |
| 2.077         | 14    | 5138  | 1.9844          | 1.0       | 87.9358           | 6.8263 |
| 2.0346        | 15    | 5505  | 1.9514          | 1.0       | 86.9127           | 7.0307 |
| 1.9813        | 16    | 5872  | 1.9358          | 1.0       | 86.8131           | 7.2225 |
| 1.9399        | 17    | 6239  | 1.9115          | 1.0       | 87.6552           | 7.4084 |
| 1.8997        | 18    | 6606  | 1.8905          | 1.0       | 91.3228           | 7.6722 |
| 1.8824        | 19    | 6973  | 1.8781          | 1.0       | 90.9620           | 7.5620 |
| 1.8423        | 20    | 7340  | 1.8488          | 1.0       | 88.2698           | 7.7587 |
| 1.7856        | 21    | 7707  | 1.8354          | 1.0       | 88.6626           | 7.8544 |
| 1.7663        | 22    | 8074  | 1.8322          | 1.0       | 89.3511           | 7.7950 |
| 1.7213        | 23    | 8441  | 1.8203          | 1.0       | 87.7341           | 7.9426 |
| 1.6967        | 24    | 8808  | 1.8010          | 1.0       | 90.4521           | 8.0048 |
| 1.6952        | 25    | 9175  | 1.7929          | 1.0       | 91.0979           | 7.9804 |
| 1.6379        | 26    | 9542  | 1.7913          | 1.0       | 91.5110           | 7.8794 |
| 1.6116        | 27    | 9909  | 1.7820          | 1.0       | 86.8504           | 8.0096 |
| 1.583         | 28    | 10276 | 1.7778          | 1.0       | 85.1309           | 8.2697 |
| 1.5875        | 29    | 10643 | 1.7725          | 1.0       | 91.7085           | 8.2960 |
| 1.5704        | 30    | 11010 | 1.7684          | 1.0       | 92.7509           | 8.1364 |
| 1.5249        | 31    | 11377 | 1.7576          | 1.0       | 90.7023           | 8.2841 |
| 1.5196        | 32    | 11744 | 1.7593          | 1.0       | 94.4428           | 8.3993 |
| 1.471         | 33    | 12111 | 1.7525          | 1.0       | 91.7521           | 8.3661 |
| 1.4632        | 34    | 12478 | 1.7573          | 1.0       | 90.2000           | 8.3085 |
| 1.4363        | 35    | 12845 | 1.7428          | 1.0       | 89.6053           | 8.6387 |
| 1.4309        | 36    | 13212 | 1.7485          | 1.0       | 95.4347           | 8.8372 |
| 1.4165        | 37    | 13579 | 1.7513          | 1.0       | 84.3171           | 8.7722 |
| 1.4042        | 38    | 13946 | 1.7471          | 1.0       | 87.1024           | 8.8673 |
| 1.3828        | 39    | 14313 | 1.7447          | 1.0       | 89.7493           | 8.7516 |
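The Data Size column follows a doubling warm-up: each epoch trains on twice the fraction of the previous one (1/128, 1/64, …) until the full dataset is reached at epoch 8. A small sketch of that observed schedule (the closed form is inferred from the table, not documented in the card):

```python
def data_fraction(epoch: int) -> float:
    """Inferred data-size schedule: doubling from 1/128 at epoch 1,
    capped at the full dataset (1.0) from epoch 8 onward."""
    if epoch == 0:
        return 0.0  # epoch 0 is the pre-training evaluation row
    return min(1.0, 2.0 ** (epoch - 8))
```

For example, `data_fraction(1)` gives 1/128 = 0.0078125, matching the 0.0078 shown for epoch 1.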

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
Model size

  • 0.3B params (F32, Safetensors)

Model tree

  • Model: contemmcm/400c5511b6dea2d366b982d82c7f2f47
  • Base model: google-t5/t5-base