# t5-v1_1-base-gramatika-final-e8-b16
This model is a fine-tuned version of [google/t5-v1_1-base](https://huggingface.co/google/t5-v1_1-base) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.1723
- Rouge1: 43.8331
- Rouge2: 34.7609
- Rougel: 43.5803
- Rougelsum: 43.5467
- Gen Len: 18.9287
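
No usage example is documented on this card. The following is a minimal inference-and-scoring sketch, assuming the checkpoint is published under the repo id `t5-v1_1-base-gramatika-final-e8-b16` (a hypothetical hub path) and that the model performs grammatical error correction, as the name suggests; the example input and the absence of a task prefix are assumptions:

```python
import evaluate
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Hypothetical repo id taken from the card title; adjust to the actual hub path.
model_id = "t5-v1_1-base-gramatika-final-e8-b16"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Example input; the task prefix (if any) is not documented on this card.
inputs = tokenizer("She go to school yesterday.", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=20)
prediction = tokenizer.decode(output_ids[0], skip_special_tokens=True)

# ROUGE scoring with the `evaluate` library; the reference here is illustrative.
rouge = evaluate.load("rouge")
scores = rouge.compute(predictions=[prediction],
                       references=["She went to school yesterday."])
print(scores)
```

The `rouge` metric returns the same `rouge1`/`rouge2`/`rougeL`/`rougeLsum` keys as the scores reported above.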
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adafactor
- lr_scheduler_type: linear
- num_epochs: 8
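
The card lists the hyperparameters but not the Trainer configuration itself. Below is a sketch of equivalent `Seq2SeqTrainingArguments`, assuming the built-in 🤗 Trainer with its `optim="adafactor"` option; `output_dir` and `predict_with_generate` are assumptions, and only the values listed above come from the card:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-v1_1-base-gramatika-final-e8-b16",  # assumed
    learning_rate=1e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adafactor",            # Adafactor via the Trainer's built-in option
    lr_scheduler_type="linear",
    num_train_epochs=8,
    predict_with_generate=True,   # assumed: needed for the ROUGE/Gen Len metrics
)
```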
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 2.6434 | 0.37 | 300 | 0.4530 | 38.4418 | 26.1528 | 37.8295 | 37.7894 | 18.9434 |
| 0.5551 | 0.73 | 600 | 0.3368 | 39.883 | 28.2471 | 39.2883 | 39.2822 | 18.9345 |
| 0.4523 | 1.1 | 900 | 0.2959 | 40.3084 | 29.2298 | 39.8742 | 39.8747 | 18.9350 |
| 0.4165 | 1.46 | 1200 | 0.2610 | 41.0422 | 30.4902 | 40.6542 | 40.6354 | 18.9350 |
| 0.3196 | 1.83 | 1500 | 0.2292 | 41.6111 | 31.1549 | 41.2572 | 41.2477 | 18.9355 |
| 0.2718 | 2.2 | 1800 | 0.2153 | 41.9295 | 31.6902 | 41.5757 | 41.5624 | 18.9334 |
| 0.2446 | 2.56 | 2100 | 0.2055 | 42.2918 | 32.4861 | 42.0541 | 42.0135 | 18.9324 |
| 0.2301 | 2.93 | 2400 | 0.2232 | 42.6172 | 33.0243 | 42.3474 | 42.3224 | 18.9334 |
| 0.1997 | 3.29 | 2700 | 0.1859 | 42.8442 | 33.4479 | 42.6294 | 42.6121 | 18.9350 |
| 0.186 | 3.66 | 3000 | 0.1816 | 42.9407 | 33.5872 | 42.7248 | 42.7125 | 18.9277 |
| 0.1736 | 4.02 | 3300 | 0.1771 | 43.1994 | 34.0513 | 43.0334 | 42.9982 | 18.9308 |
| 0.1439 | 4.39 | 3600 | 0.1818 | 43.2146 | 33.997 | 43.0221 | 42.9893 | 18.9282 |
| 0.1429 | 4.76 | 3900 | 0.1732 | 43.4458 | 34.377 | 43.3072 | 43.26 | 18.9277 |
| 0.132 | 5.12 | 4200 | 0.1795 | 43.7156 | 34.6069 | 43.4982 | 43.481 | 18.9292 |
| 0.1151 | 5.49 | 4500 | 0.1767 | 43.7618 | 34.7345 | 43.5565 | 43.5181 | 18.9287 |
| 0.1127 | 5.85 | 4800 | 0.1723 | 43.8331 | 34.7609 | 43.5803 | 43.5467 | 18.9287 |
| 0.0994 | 6.22 | 5100 | 0.1757 | 43.8866 | 34.9216 | 43.641 | 43.6214 | 18.9287 |
| 0.0892 | 6.59 | 5400 | 0.1779 | 43.9415 | 34.9905 | 43.7332 | 43.7063 | 18.9292 |
| 0.0914 | 6.95 | 5700 | 0.1725 | 43.9439 | 35.0456 | 43.7419 | 43.7266 | 18.9298 |
| 0.0772 | 7.32 | 6000 | 0.1776 | 44.1132 | 35.3173 | 43.9301 | 43.9135 | 18.9287 |
| 0.0755 | 7.68 | 6300 | 0.1778 | 44.0494 | 35.3179 | 43.8797 | 43.8587 | 18.9282 |
### Framework versions
- Transformers 4.30.1
- Pytorch 1.11.0a0+b6df043
- Datasets 2.12.0
- Tokenizers 0.13.3