rut5-base-absum-finetuned-summ

This model is a fine-tuned version of cointegrated/rut5-base-absum on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6039
  • Rouge1: 97.0122
  • Rouge2: 94.5148
  • RougeL: 97.0189
  • RougeLsum: 96.9668

Model description

More information needed

Intended uses & limitations

More information needed
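
Pending fuller documentation, below is a minimal inference sketch. The generation settings (max_length, num_beams, no_repeat_ngram_size) are illustrative assumptions, not settings documented in this card; the base model cointegrated/rut5-base-absum targets Russian abstractive summarization, so a Russian input is used.

```python
# A minimal inference sketch, not official usage guidance.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "Laterr/rut5-base-absum-finetuned-summ"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

text = "Текст для суммаризации ..."  # source text (placeholder)
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(
    **inputs,
    max_length=128,          # assumed cap on summary length
    num_beams=4,             # assumed beam search width
    no_repeat_ngram_size=3,  # assumed repetition guard
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```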

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 8
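
For reference, a sketch of Seq2SeqTrainingArguments reproducing the hyperparameters above; output_dir, eval_strategy, and predict_with_generate are assumptions not stated in this card.

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="rut5-base-absum-finetuned-summ",  # placeholder
    learning_rate=5.6e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=8,
    eval_strategy="epoch",       # assumed: the results table logs once per epoch
    predict_with_generate=True,  # assumed: needed to compute ROUGE at eval time
)
```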

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| No log        | 1.0   | 15   | 0.9642          | 88.9988 | 70.6048 | 88.8684 | 88.9803   |
| No log        | 2.0   | 30   | 0.7765          | 94.6938 | 86.9198 | 94.7219 | 94.6778   |
| No log        | 3.0   | 45   | 0.6995          | 96.1002 | 90.9986 | 96.058  | 96.058    |
| No log        | 4.0   | 60   | 0.6596          | 96.0421 | 92.2644 | 96.067  | 96.0107   |
| No log        | 5.0   | 75   | 0.6294          | 96.5868 | 93.2489 | 96.5836 | 96.5625   |
| No log        | 6.0   | 90   | 0.6172          | 96.4605 | 92.827  | 96.4538 | 96.4071   |
| No log        | 7.0   | 105  | 0.6091          | 97.0122 | 94.5148 | 97.0189 | 96.9668   |
| 1.0079        | 8.0   | 120  | 0.6039          | 97.0122 | 94.5148 | 97.0189 | 96.9668   |
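
A hedged sketch of how ROUGE columns like those above are typically computed with the evaluate library; the prediction/reference pairs are placeholders. The default ROUGE tokenizer drops non-ASCII characters, so a whitespace tokenizer is passed for Russian text (an assumption about this card's eval setup).

```python
import evaluate

rouge = evaluate.load("rouge")
predictions = ["краткое содержание текста"]  # decoded model outputs (placeholder)
references = ["краткое содержание текста"]   # gold summaries (placeholder)
scores = rouge.compute(
    predictions=predictions,
    references=references,
    tokenizer=lambda s: s.split(),  # keep Cyrillic tokens intact
)
print({k: round(v * 100, 4) for k, v in scores.items()})  # scale to match the table
```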

Framework versions

  • Transformers 4.53.0
  • Pytorch 2.2.1+cu118
  • Datasets 4.0.0
  • Tokenizers 0.21.4