t5_small_en-pt / README.md

rdsmaia

metadata

license: apache-2.0
base_model: t5-small
tags:
  - generated_from_trainer
datasets:
  - opus_books
metrics:
  - bleu
model-index:
  - name: t5_small_en-pt
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: opus_books
          type: opus_books
          config: en-pt
          split: train
          args: en-pt
        metrics:
          - name: Bleu
            type: bleu
            value: 6.3078

t5_small_en-pt

This model is a fine-tuned version of t5-small on the en-pt configuration of the opus_books dataset. After 50 epochs of training, it achieves the following results on the evaluation set:

Loss: 1.5396
Bleu: 6.3078
Gen Len: 17.9644
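
For readers who want to try the checkpoint, the following is a minimal inference sketch, not usage documented by the author. It assumes the model is published on the Hugging Face Hub under the ID `rdsmaia/t5_small_en-pt` and that inputs carry a T5-style task prefix, `translate English to Portuguese: `. This card states neither, so both should be verified before relying on the output. The helper names `build_prompt` and `translate` are hypothetical.

```python
# Assumptions (not stated in this card): the Hub ID and the task prefix below.
MODEL_ID = "rdsmaia/t5_small_en-pt"
TASK_PREFIX = "translate English to Portuguese: "


def build_prompt(text: str) -> str:
    """Prepend the assumed task prefix to the English source sentence."""
    return TASK_PREFIX + text.strip()


def translate(text: str, max_new_tokens: int = 64) -> str:
    """Load the checkpoint and generate a translation for one sentence."""
    # Imported lazily so build_prompt stays usable without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(text), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `translate("The house is small.")` downloads the weights on first use. Given the reported Bleu of 6.3078, translations are likely to be rough.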

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
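
Although the card gives no dataset sizes, a rough bound on the training set can be derived from figures it does report: a train_batch_size of 32 and 36 optimizer steps per epoch in the training results. The sketch below assumes a single device, no gradient accumulation, and that the last partial batch is kept; none of these is confirmed by the card, so the result is an estimate only. The function name `train_size_bounds` is hypothetical.

```python
TRAIN_BATCH_SIZE = 32   # train_batch_size from the hyperparameters in this card
STEPS_PER_EPOCH = 36    # step count at epoch 1.0 in the training results


def train_size_bounds(steps_per_epoch: int, batch_size: int) -> tuple[int, int]:
    """Return the smallest and largest n with ceil(n / batch_size) == steps_per_epoch."""
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


low, high = train_size_bounds(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE)
print(f"training examples: between {low} and {high}")
```

Under these assumptions, the training split holds between 1121 and 1152 sentence pairs.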

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0002
train_batch_size: 32
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 50
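
To make the linear lr_scheduler_type concrete, here is a small sketch of how the learning rate would decay over the run. It takes the base rate of 0.0002 from the list above and the total of 1800 steps from the training results, and it assumes zero warmup steps, since the card lists no warmup setting. The function `linear_lr` is a hypothetical stand-alone helper, not the training code used for this model.

```python
BASE_LR = 2e-4       # learning_rate from this card
TOTAL_STEPS = 1800   # final step in the training results
WARMUP_STEPS = 0     # assumption: the card lists no warmup setting


def linear_lr(
    step: int,
    base_lr: float = BASE_LR,
    total_steps: int = TOTAL_STEPS,
    warmup_steps: int = WARMUP_STEPS,
) -> float:
    """Learning rate after `step` updates: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 900, 1800):
    print(step, linear_lr(step))
```

With these values the rate starts at the full 0.0002, is halved by step 900 (epoch 25), and reaches zero at step 1800.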

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
No log	1.0	36	2.9730	1.4396	17.9786
No log	2.0	72	2.6798	1.7604	18.1495
No log	3.0	108	2.4812	1.8606	18.2954
No log	4.0	144	2.3366	2.1379	18.2028
No log	5.0	180	2.2165	2.3636	18.1922
No log	6.0	216	2.1192	2.7925	18.1815
No log	7.0	252	2.0400	3.2243	18.1957
No log	8.0	288	1.9703	3.6722	18.1886
No log	9.0	324	1.9227	3.6416	18.1851
No log	10.0	360	1.8762	3.9984	18.1851
No log	11.0	396	1.8303	4.0942	18.1103
No log	12.0	432	1.8044	4.4425	18.1388
No log	13.0	468	1.7719	4.3346	18.1423
2.281	14.0	504	1.7477	4.6716	18.1032
2.281	15.0	540	1.7256	4.7874	18.1139
2.281	16.0	576	1.7057	4.8878	18.0783
2.281	17.0	612	1.6871	4.8045	18.0819
2.281	18.0	648	1.6770	4.9783	18.0676
2.281	19.0	684	1.6542	5.1069	18.0107
2.281	20.0	720	1.6414	4.902	18.0569
2.281	21.0	756	1.6326	5.0385	18.0214
2.281	22.0	792	1.6228	5.1533	18.0534
2.281	23.0	828	1.6233	5.397	18.0285
2.281	24.0	864	1.6076	5.4458	18.0214
2.281	25.0	900	1.5995	5.5752	18.0712
2.281	26.0	936	1.5938	5.3835	18.0925
2.281	27.0	972	1.5863	5.6135	18.0107
1.3904	28.0	1008	1.5780	5.8076	18.0356
1.3904	29.0	1044	1.5757	5.8528	18.0641
1.3904	30.0	1080	1.5721	5.8875	18.0285
1.3904	31.0	1116	1.5648	6.1429	18.0498
1.3904	32.0	1152	1.5596	6.0269	18.0819
1.3904	33.0	1188	1.5592	6.2233	18.0427
1.3904	34.0	1224	1.5552	6.0874	18.0569
1.3904	35.0	1260	1.5542	6.2611	18.0463
1.3904	36.0	1296	1.5493	6.1328	18.0391
1.3904	37.0	1332	1.5509	6.2341	18.0356
1.3904	38.0	1368	1.5455	6.2754	18.0036
1.3904	39.0	1404	1.5468	6.2263	18.0071
1.3904	40.0	1440	1.5446	6.1178	17.9929
1.3904	41.0	1476	1.5436	6.3536	17.9964
1.1159	42.0	1512	1.5426	6.296	17.9715
1.1159	43.0	1548	1.5402	6.1919	18.0356
1.1159	44.0	1584	1.5386	6.2256	18.0356
1.1159	45.0	1620	1.5392	6.2119	18.0356
1.1159	46.0	1656	1.5404	6.3696	18.032
1.1159	47.0	1692	1.5390	6.3779	17.9964
1.1159	48.0	1728	1.5392	6.2079	18.0107
1.1159	49.0	1764	1.5396	6.3334	17.968
1.1159	50.0	1800	1.5396	6.3078	17.9644
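
The headline metrics come from the final epoch, which is not the best row in the table above. The sketch below copies the last ten rows (every earlier epoch has both a higher validation loss and a lower Bleu) and picks out the best epoch by each criterion. The card does not say whether any checkpoint other than the final one was saved, so this is only a way to read the table.

```python
# Rows copied from the last ten epochs of the table above:
# (epoch, validation loss, Bleu)
ROWS = [
    (41, 1.5436, 6.3536),
    (42, 1.5426, 6.2960),
    (43, 1.5402, 6.1919),
    (44, 1.5386, 6.2256),
    (45, 1.5392, 6.2119),
    (46, 1.5404, 6.3696),
    (47, 1.5390, 6.3779),
    (48, 1.5392, 6.2079),
    (49, 1.5396, 6.3334),
    (50, 1.5396, 6.3078),
]

best_bleu = max(ROWS, key=lambda row: row[2])     # highest Bleu
lowest_loss = min(ROWS, key=lambda row: row[1])   # lowest validation loss
final = ROWS[-1]                                  # the reported checkpoint

print(f"best Bleu: {best_bleu[2]} at epoch {best_bleu[0]}")
print(f"lowest validation loss: {lowest_loss[1]} at epoch {lowest_loss[0]}")
print(f"final epoch {final[0]}: loss {final[1]}, Bleu {final[2]}")
```

Bleu peaks at epoch 47 (6.3779) and validation loss is lowest at epoch 44 (1.5386). Both are close to the final values, which suggests the run had largely plateaued by the last ten epochs.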

Framework versions

Transformers 4.31.0
PyTorch 2.0.1+cu117
Datasets 2.13.1
Tokenizers 0.13.3