# sft-base_loss-t5-v1_1-base-mle0-ul0-tox0-e4
This model is a fine-tuned version of google/t5-v1_1-base on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 1.0387
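As a minimal usage sketch (assuming the `transformers` and `torch` packages are installed; the input text below is purely illustrative, since the training data and task are not documented):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "TarhanE/sft-base_loss-t5-v1_1-base-mle0-ul0-tox0-e4"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Encode an example prompt and generate with the fine-tuned model.
inputs = tokenizer("Hello, how are you today?", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=32)
output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(output_text)
```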
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant
- lr_scheduler_warmup_steps: 5
- num_epochs: 10
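The listed total_train_batch_size is the per-device train_batch_size multiplied by gradient_accumulation_steps; a quick sanity check in plain Python:

```python
# Hyperparameters copied from the list above.
train_batch_size = 4
gradient_accumulation_steps = 2

# Effective (total) train batch size per optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # → 8
```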
### Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
6.89 | 0.2899 | 200 | 3.1273 |
6.8047 | 0.5797 | 400 | 3.4536 |
4.0718 | 0.8696 | 600 | 1.7809 |
2.9926 | 1.1594 | 800 | 1.5160 |
2.4307 | 1.4493 | 1000 | 1.3290 |
1.9824 | 1.7391 | 1200 | 1.2237 |
1.8589 | 2.0290 | 1400 | 1.1363 |
1.7644 | 2.3188 | 1600 | 1.1028 |
1.5996 | 2.6087 | 1800 | 1.0860 |
1.4636 | 2.8986 | 2000 | 1.0699 |
1.3986 | 3.1884 | 2200 | 1.0776 |
1.3767 | 3.4783 | 2400 | 1.0204 |
1.3042 | 3.7681 | 2600 | 1.0475 |
1.3342 | 4.0580 | 2800 | 1.0547 |
1.2306 | 4.3478 | 3000 | 1.0423 |
1.2201 | 4.6377 | 3200 | 1.0424 |
1.2224 | 4.9275 | 3400 | 1.0388 |
1.205 | 5.2174 | 3600 | 1.0178 |
1.0739 | 5.5072 | 3800 | 1.0303 |
1.0681 | 5.7971 | 4000 | 1.0307 |
1.0863 | 6.0870 | 4200 | 1.0071 |
1.0393 | 6.3768 | 4400 | 1.0509 |
1.0076 | 6.6667 | 4600 | 1.0143 |
1.0255 | 6.9565 | 4800 | 1.0196 |
0.9258 | 7.2464 | 5000 | 1.0367 |
0.9698 | 7.5362 | 5200 | 1.0203 |
0.978 | 7.8261 | 5400 | 1.0055 |
0.9228 | 8.1159 | 5600 | 1.0372 |
0.9173 | 8.4058 | 5800 | 1.0240 |
0.8497 | 8.6957 | 6000 | 1.0433 |
0.8383 | 8.9855 | 6200 | 1.0269 |
0.8392 | 9.2754 | 6400 | 1.0480 |
0.8204 | 9.5652 | 6600 | 1.0442 |
0.8157 | 9.8551 | 6800 | 1.0387 |
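Note that the final validation loss (1.0387) is not the minimum reached during training; a short script over a few (step, validation loss) pairs from the table above finds the best checkpoint:

```python
# A subset of (step, validation_loss) rows from the results table.
rows = [
    (4200, 1.0071),
    (5400, 1.0055),
    (6200, 1.0269),
    (6800, 1.0387),
]

# The checkpoint with the lowest validation loss.
best_step, best_loss = min(rows, key=lambda r: r[1])
print(best_step, best_loss)  # → 5400 1.0055
```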
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1