t5-v1_1-large-gramatika-final-e8-b16
This model is a fine-tuned version of google/t5-v1_1-large on an unspecified dataset (see Training and evaluation data below). It achieves the following results on the evaluation set; a minimal inference sketch follows the metrics:
- Loss: 0.1652
- Rouge1: 43.5873
- Rouge2: 34.5612
- RougeL: 43.3549
- RougeLsum: 43.3701
- Gen Len: 18.9261
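Since the card does not yet document usage, the following is a minimal sketch of loading the checkpoint for inference with the Transformers API. The repository id is taken from the model name in the title and may need to be replaced with the actual hub path; the expected input format (for example, any task prefix) is not documented here and is an assumption.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed repository id (from the title); replace with the actual hub path.
model_id = "t5-v1_1-large-gramatika-final-e8-b16"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Placeholder input; the expected prompt/prefix format is not documented.
inputs = tokenizer("your input sentence here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```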
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training (a hedged configuration sketch follows the list):
- learning_rate: 0.001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adafactor
- lr_scheduler_type: linear
- num_epochs: 8
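For illustration only, the hyperparameters above could be expressed as a `Seq2SeqTrainingArguments` configuration roughly like the one below. The `output_dir` value and the 300-step evaluation interval (inferred from the results table) are assumptions; the original training script is not included in this card.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of a configuration matching the listed hyperparameters
# (transformers 4.30-era API); not the card author's actual script.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-v1_1-large-gramatika-final-e8-b16",  # assumed
    learning_rate=1e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adafactor",
    lr_scheduler_type="linear",
    num_train_epochs=8,
    predict_with_generate=True,
    evaluation_strategy="steps",
    eval_steps=300,  # inferred from the evaluation steps in the table below
)
```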
Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 1.021 | 0.37 | 300 | 0.3972 | 34.0058 | 21.9508 | 32.8363 | 32.8413 | 18.7689 |
| 0.4571 | 0.73 | 600 | 0.3349 | 33.8082 | 23.1941 | 33.2259 | 33.2092 | 18.9198 |
| 0.3649 | 1.1 | 900 | 0.2819 | 37.177 | 26.2961 | 36.1081 | 36.1 | 18.9251 |
| 0.2812 | 1.46 | 1200 | 0.2162 | 41.57 | 31.5746 | 41.2196 | 41.2329 | 18.9303 |
| 0.2348 | 1.83 | 1500 | 0.1934 | 42.498 | 32.9335 | 42.2362 | 42.2357 | 18.9313 |
| 0.1883 | 2.2 | 1800 | 0.1900 | 42.8109 | 33.4358 | 42.5931 | 42.6018 | 18.9308 |
| 0.1607 | 2.56 | 2100 | 0.1790 | 42.8463 | 33.5315 | 42.6271 | 42.6405 | 18.9329 |
| 0.1553 | 2.93 | 2400 | 0.1764 | 43.0997 | 34.0805 | 42.9099 | 42.9121 | 18.9313 |
| 0.1143 | 3.29 | 2700 | 0.1735 | 43.2635 | 34.1534 | 43.0452 | 43.0553 | 18.9334 |
| 0.1078 | 3.66 | 3000 | 0.1652 | 43.5873 | 34.5612 | 43.3549 | 43.3701 | 18.9261 |
| 0.1029 | 4.02 | 3300 | 0.1729 | 43.7576 | 35.0653 | 43.5811 | 43.5982 | 18.9292 |
| 0.0669 | 4.39 | 3600 | 0.1773 | 43.6613 | 34.8016 | 43.4742 | 43.4793 | 18.9266 |
| 0.0671 | 4.76 | 3900 | 0.1710 | 43.92 | 35.1941 | 43.7078 | 43.7307 | 18.9313 |
| 0.0584 | 5.12 | 4200 | 0.1883 | 44.0979 | 35.4031 | 43.9062 | 43.9378 | 18.9334 |
| 0.0408 | 5.49 | 4500 | 0.1946 | 44.1764 | 35.6678 | 44.0201 | 44.0305 | 18.9319 |
| 0.0405 | 5.85 | 4800 | 0.1901 | 44.2228 | 35.7124 | 44.0157 | 44.0352 | 18.9319 |
| 0.0311 | 6.22 | 5100 | 0.2098 | 44.4711 | 36.1284 | 44.3122 | 44.3301 | 18.9251 |
| 0.0239 | 6.59 | 5400 | 0.2087 | 44.4937 | 36.2131 | 44.3726 | 44.3922 | 18.9271 |
| 0.0241 | 6.95 | 5700 | 0.2111 | 44.48 | 36.0938 | 44.3196 | 44.3404 | 18.9256 |
| 0.0162 | 7.32 | 6000 | 0.2235 | 44.4869 | 36.1867 | 44.3554 | 44.3615 | 18.9251 |
| 0.0147 | 7.68 | 6300 | 0.2281 | 44.618 | 36.3734 | 44.5021 | 44.5151 | 18.9240 |
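The ROUGE scores above appear to be reported on a 0–100 scale. For reference, a minimal sketch of computing comparable ROUGE metrics with the `evaluate` library (an assumption; the card does not include the actual evaluation script or data):

```python
import evaluate

# Hypothetical predictions/references; the actual evaluation data is not documented here.
predictions = ["she goes to school every day"]
references = ["she goes to school every day"]

rouge = evaluate.load("rouge")
scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)

# evaluate returns fractions in [0, 1]; scale by 100 to match the table above.
print({k: round(v * 100, 4) for k, v in scores.items()})
```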
Framework versions
- Transformers 4.30.1
- Pytorch 1.11.0a0+b6df043
- Datasets 2.12.0
- Tokenizers 0.13.3