---
license: apache-2.0
base_model: t5-small
tags:
  - generated_from_trainer
datasets:
  - opus_books
metrics:
  - bleu
model-index:
  - name: t5_small_en-pt
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: opus_books
          type: opus_books
          config: en-pt
          split: train
          args: en-pt
        metrics:
          - name: Bleu
            type: bleu
            value: 6.3078
---

# t5_small_en-pt

This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on the opus_books dataset. It achieves the following results on the evaluation set:

- Loss: 1.5396
- Bleu: 6.3078
- Gen Len: 17.9644
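
The card ships no usage example, so here is a minimal inference sketch. The Hub repo id `rdsmaia/t5_small_en-pt` and the T5-style task prefix are assumptions; the card does not state how inputs were formatted during fine-tuning.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "rdsmaia/t5_small_en-pt"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# T5 checkpoints are usually prompted with a task prefix; the exact
# prefix used during fine-tuning is an assumption here.
text = "translate English to Portuguese: The book is on the table."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```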

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
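
The data preparation is not documented, but the metadata above declares the `en-pt` configuration of `opus_books`, which ships only a train split. A plausible loading sketch follows; the held-out fraction is an assumption.

```python
from datasets import load_dataset

# opus_books has no predefined test split for en-pt, so an eval set
# has to be carved out; the 20% test fraction is an assumption.
books = load_dataset("opus_books", "en-pt")
books = books["train"].train_test_split(test_size=0.2, seed=42)
print(books["train"][0])  # {'id': ..., 'translation': {'en': ..., 'pt': ...}}
```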

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0002
- train_batch_size: 32
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
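
Expressed in code, these settings correspond roughly to the following `Seq2SeqTrainingArguments`. The output directory, the per-epoch evaluation strategy, and the `predict_with_generate` flag are assumptions inferred from the per-epoch results table below.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5_small_en-pt",   # assumed
    learning_rate=2e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    evaluation_strategy="epoch",   # assumed: results are logged once per epoch
    predict_with_generate=True,    # assumed: needed to compute Bleu / Gen Len
)
```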

### Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu   | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|
| No log        | 1.0   | 36   | 2.9730          | 1.4396 | 17.9786 |
| No log        | 2.0   | 72   | 2.6798          | 1.7604 | 18.1495 |
| No log        | 3.0   | 108  | 2.4812          | 1.8606 | 18.2954 |
| No log        | 4.0   | 144  | 2.3366          | 2.1379 | 18.2028 |
| No log        | 5.0   | 180  | 2.2165          | 2.3636 | 18.1922 |
| No log        | 6.0   | 216  | 2.1192          | 2.7925 | 18.1815 |
| No log        | 7.0   | 252  | 2.0400          | 3.2243 | 18.1957 |
| No log        | 8.0   | 288  | 1.9703          | 3.6722 | 18.1886 |
| No log        | 9.0   | 324  | 1.9227          | 3.6416 | 18.1851 |
| No log        | 10.0  | 360  | 1.8762          | 3.9984 | 18.1851 |
| No log        | 11.0  | 396  | 1.8303          | 4.0942 | 18.1103 |
| No log        | 12.0  | 432  | 1.8044          | 4.4425 | 18.1388 |
| No log        | 13.0  | 468  | 1.7719          | 4.3346 | 18.1423 |
| 2.281         | 14.0  | 504  | 1.7477          | 4.6716 | 18.1032 |
| 2.281         | 15.0  | 540  | 1.7256          | 4.7874 | 18.1139 |
| 2.281         | 16.0  | 576  | 1.7057          | 4.8878 | 18.0783 |
| 2.281         | 17.0  | 612  | 1.6871          | 4.8045 | 18.0819 |
| 2.281         | 18.0  | 648  | 1.6770          | 4.9783 | 18.0676 |
| 2.281         | 19.0  | 684  | 1.6542          | 5.1069 | 18.0107 |
| 2.281         | 20.0  | 720  | 1.6414          | 4.902  | 18.0569 |
| 2.281         | 21.0  | 756  | 1.6326          | 5.0385 | 18.0214 |
| 2.281         | 22.0  | 792  | 1.6228          | 5.1533 | 18.0534 |
| 2.281         | 23.0  | 828  | 1.6233          | 5.397  | 18.0285 |
| 2.281         | 24.0  | 864  | 1.6076          | 5.4458 | 18.0214 |
| 2.281         | 25.0  | 900  | 1.5995          | 5.5752 | 18.0712 |
| 2.281         | 26.0  | 936  | 1.5938          | 5.3835 | 18.0925 |
| 2.281         | 27.0  | 972  | 1.5863          | 5.6135 | 18.0107 |
| 1.3904        | 28.0  | 1008 | 1.5780          | 5.8076 | 18.0356 |
| 1.3904        | 29.0  | 1044 | 1.5757          | 5.8528 | 18.0641 |
| 1.3904        | 30.0  | 1080 | 1.5721          | 5.8875 | 18.0285 |
| 1.3904        | 31.0  | 1116 | 1.5648          | 6.1429 | 18.0498 |
| 1.3904        | 32.0  | 1152 | 1.5596          | 6.0269 | 18.0819 |
| 1.3904        | 33.0  | 1188 | 1.5592          | 6.2233 | 18.0427 |
| 1.3904        | 34.0  | 1224 | 1.5552          | 6.0874 | 18.0569 |
| 1.3904        | 35.0  | 1260 | 1.5542          | 6.2611 | 18.0463 |
| 1.3904        | 36.0  | 1296 | 1.5493          | 6.1328 | 18.0391 |
| 1.3904        | 37.0  | 1332 | 1.5509          | 6.2341 | 18.0356 |
| 1.3904        | 38.0  | 1368 | 1.5455          | 6.2754 | 18.0036 |
| 1.3904        | 39.0  | 1404 | 1.5468          | 6.2263 | 18.0071 |
| 1.3904        | 40.0  | 1440 | 1.5446          | 6.1178 | 17.9929 |
| 1.3904        | 41.0  | 1476 | 1.5436          | 6.3536 | 17.9964 |
| 1.1159        | 42.0  | 1512 | 1.5426          | 6.296  | 17.9715 |
| 1.1159        | 43.0  | 1548 | 1.5402          | 6.1919 | 18.0356 |
| 1.1159        | 44.0  | 1584 | 1.5386          | 6.2256 | 18.0356 |
| 1.1159        | 45.0  | 1620 | 1.5392          | 6.2119 | 18.0356 |
| 1.1159        | 46.0  | 1656 | 1.5404          | 6.3696 | 18.032  |
| 1.1159        | 47.0  | 1692 | 1.5390          | 6.3779 | 17.9964 |
| 1.1159        | 48.0  | 1728 | 1.5392          | 6.2079 | 18.0107 |
| 1.1159        | 49.0  | 1764 | 1.5396          | 6.3334 | 17.968  |
| 1.1159        | 50.0  | 1800 | 1.5396          | 6.3078 | 17.9644 |
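
The card does not show how Bleu and Gen Len were computed. The sketch below follows the usual `run_translation.py`-style metric function and is an assumption, not the author's exact code; it also assumes the `tokenizer` loaded in the inference sketch above is in scope.

```python
import numpy as np
import evaluate

sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    if isinstance(preds, tuple):
        preds = preds[0]
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    # -100 marks ignored label positions; swap in the pad token before decoding
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = sacrebleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    gen_len = np.mean(
        [np.count_nonzero(p != tokenizer.pad_token_id) for p in preds]
    )
    return {"bleu": result["score"], "gen_len": gen_len}
```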

### Framework versions

- Transformers 4.31.0
- Pytorch 2.0.1+cu117
- Datasets 2.13.1
- Tokenizers 0.13.3