---
license: apache-2.0
base_model: facebook/bart-large
tags:
- text2text-generation
- generated_from_trainer
metrics:
- sacrebleu
model-index:
- name: model_v3_v2
  results: []
---

# model_v3_v2

This model is a fine-tuned version of [facebook/bart-large](https://huggingface.co/facebook/bart-large) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 1.1977
- Sacrebleu: 66.7256
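
For quick experimentation, the checkpoint can be loaded through the standard `transformers` text2text-generation pipeline. A minimal sketch, assuming the model is published under the repo id `sehilnlf/model_v3_v2` (inferred from the page header, not stated in the card):

```python
from transformers import pipeline

# Repo id is an assumption inferred from the page header; adjust if the
# checkpoint lives elsewhere.
generator = pipeline("text2text-generation", model="sehilnlf/model_v3_v2")

# The card does not document the task, so this input is purely illustrative.
output = generator("Your input text here", max_length=128)
print(output[0]["generated_text"])
```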

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged `Seq2SeqTrainingArguments` sketch follows the list):

- learning_rate: 5e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 40
- mixed_precision_training: Native AMP
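
The card does not include the training script, but these values map directly onto `Seq2SeqTrainingArguments` from `transformers`. A minimal sketch under that assumption, with `output_dir` and all model/data wiring as placeholders rather than values taken from the card:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="model_v3_v2",        # placeholder, not from the card
    learning_rate=5e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    gradient_accumulation_steps=8,   # 32 * 8 = 256 effective train batch size
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=40,
    fp16=True,                       # "Native AMP" mixed precision
    evaluation_strategy="epoch",     # assumed; the table evaluates per epoch
    predict_with_generate=True,      # needed to score sacrebleu during eval
)
```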

### Training results

| Training Loss | Epoch | Step | Validation Loss | Sacrebleu |
|:-------------:|:-----:|:----:|:---------------:|:---------:|
| No log        | 0.99  | 54   | 0.5648          | 65.7974   |
| No log        | 1.99  | 109  | 0.6224          | 66.8854   |
| No log        | 3.0   | 164  | 0.6639          | 66.8333   |
| No log        | 4.0   | 219  | 0.5929          | 66.7857   |
| No log        | 4.99  | 273  | 0.6427          | 65.8395   |
| No log        | 5.99  | 328  | 0.6721          | 66.4172   |
| No log        | 7.0   | 383  | 0.7511          | 66.4660   |
| No log        | 8.0   | 438  | 0.7662          | 66.6480   |
| No log        | 8.99  | 492  | 0.7588          | 66.5092   |
| No log        | 9.99  | 547  | 0.7916          | 66.5144   |
| No log        | 11.0  | 602  | 0.8172          | 66.6279   |
| No log        | 12.0  | 657  | 0.8350          | 66.5607   |
| No log        | 12.99 | 711  | 0.8809          | 66.6095   |
| No log        | 13.99 | 766  | 0.8843          | 66.4089   |
| No log        | 15.0  | 821  | 1.0130          | 66.5184   |
| No log        | 16.0  | 876  | 0.9180          | 66.4269   |
| No log        | 16.99 | 930  | 0.9794          | 66.5766   |
| No log        | 17.99 | 985  | 0.9450          | 66.6713   |
| No log        | 19.0  | 1040 | 0.9880          | 66.7081   |
| No log        | 20.0  | 1095 | 0.9540          | 66.4440   |
| No log        | 20.99 | 1149 | 1.0552          | 66.5390   |
| No log        | 21.99 | 1204 | 0.9806          | 66.5975   |
| No log        | 23.0  | 1259 | 1.0528          | 66.6404   |
| No log        | 24.0  | 1314 | 1.0348          | 66.4127   |
| No log        | 24.99 | 1368 | 1.0758          | 66.6139   |
| No log        | 25.99 | 1423 | 1.1291          | 66.6778   |
| No log        | 27.0  | 1478 | 1.1112          | 66.6411   |
| No log        | 28.0  | 1533 | 1.1305          | 66.5986   |
| No log        | 28.99 | 1587 | 1.1532          | 66.5047   |
| No log        | 29.99 | 1642 | 1.1106          | 66.5662   |
| No log        | 31.0  | 1697 | 1.2084          | 66.6593   |
| No log        | 32.0  | 1752 | 1.1438          | 66.6117   |
| No log        | 32.99 | 1806 | 1.1956          | 66.6758   |
| No log        | 33.99 | 1861 | 1.1630          | 66.7359   |
| No log        | 35.0  | 1916 | 1.1570          | 66.6989   |
| No log        | 36.0  | 1971 | 1.1754          | 66.6495   |
| No log        | 36.99 | 2025 | 1.2456          | 66.7018   |
| No log        | 37.99 | 2080 | 1.2197          | 66.7990   |
| No log        | 39.0  | 2135 | 1.1886          | 66.7049   |
| No log        | 39.45 | 2160 | 1.1977          | 66.7256   |
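
The Sacrebleu column reports a corpus-level BLEU score (0-100 scale) at each evaluation. A minimal sketch of computing the same metric with the `evaluate` library, which `generated_from_trainer` cards typically use for this; the strings are illustrative only, since the card does not publish the evaluation data:

```python
import evaluate

# Illustrative predictions/references; the actual eval set is not in the card.
sacrebleu = evaluate.load("sacrebleu")
predictions = ["the cat sat on the mat"]
references = [["the cat is sitting on the mat"]]  # one list of refs per prediction

result = sacrebleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))  # corpus BLEU on a 0-100 scale
```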

### Framework versions

- Transformers 4.39.3
- PyTorch 2.1.2
- Datasets 2.18.0
- Tokenizers 0.15.2