Plainly Optimized Network
Dataset: BIGBENCH
Trainer Hyperparameters:
- lr = 5e-05
- per_device_batch_size = 1
- gradient_accumulation_steps = 4
- weight_decay = 1e-09
- seed = 42
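As a rough illustration of how these settings combine, the sketch below stores them in a plain dictionary and derives the number of samples that contribute to each optimizer step on one device. The dictionary and variable names are hypothetical and simply mirror the labels on this card; the card does not show the actual training script, and the device count is not stated, so only the per-device figure is computed.

```python
# Hyperparameters as listed on this card (key names copied from the card,
# not from any particular library's argument names).
hyperparams = {
    "lr": 5e-05,
    "per_device_batch_size": 1,
    "gradient_accumulation_steps": 4,
    "weight_decay": 1e-09,
    "seed": 42,
}

# With gradient accumulation, gradients from several small batches are summed
# before one optimizer step, so each step sees batch_size * accumulation_steps
# samples per device.
effective_batch_per_device = (
    hyperparams["per_device_batch_size"]
    * hyperparams["gradient_accumulation_steps"]
)

print(f"samples per optimizer step (per device): {effective_batch_per_device}")
```

With a per-device batch size of 1, the accumulation factor alone determines the per-device batch seen by each update.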
| eval_loss | eval_mse | epoch |
|---|---|---|
| 58.741 | 0.055 | 1.0 |
| 60.624 | 0.058 | 2.0 |
| 60.765 | 0.057 | 3.0 |
| 55.858 | 0.051 | 4.0 |
| 57.271 | 0.053 | 5.0 |
| 56.004 | 0.051 | 6.0 |
| 60.246 | 0.056 | 7.0 |
| 55.218 | 0.049 | 8.0 |
| 55.261 | 0.049 | 9.0 |
| 54.730 | 0.049 | 10.0 |
| 58.137 | 0.052 | 11.0 |
| 53.927 | 0.048 | 12.0 |
| 56.143 | 0.051 | 13.0 |
| 54.604 | 0.049 | 14.0 |
| 53.596 | 0.048 | 15.0 |
| 54.241 | 0.049 | 16.0 |
| 55.500 | 0.050 | 17.0 |
| 53.256 | 0.047 | 18.0 |
| 53.139 | 0.047 | 19.0 |
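The evaluation loss does not fall monotonically across epochs, so it can help to pick the best checkpoint programmatically. The following is a minimal sketch that copies the table values into a list and selects the epoch with the lowest `eval_loss`; the variable names are illustrative assumptions, not part of the original training code.

```python
# (epoch, eval_loss, eval_mse) rows copied from the table above.
results = [
    (1, 58.741, 0.055), (2, 60.624, 0.058), (3, 60.765, 0.057),
    (4, 55.858, 0.051), (5, 57.271, 0.053), (6, 56.004, 0.051),
    (7, 60.246, 0.056), (8, 55.218, 0.049), (9, 55.261, 0.049),
    (10, 54.730, 0.049), (11, 58.137, 0.052), (12, 53.927, 0.048),
    (13, 56.143, 0.051), (14, 54.604, 0.049), (15, 53.596, 0.048),
    (16, 54.241, 0.049), (17, 55.500, 0.050), (18, 53.256, 0.047),
    (19, 53.139, 0.047),
]

# Select the row with the smallest eval_loss.
best_epoch, best_loss, best_mse = min(results, key=lambda row: row[1])

# Count epochs where eval_loss rose compared with the previous epoch,
# as a simple measure of how noisy the curve is.
regressions = sum(
    1 for prev, curr in zip(results, results[1:]) if curr[1] > prev[1]
)

print(f"best epoch: {best_epoch}, eval_loss={best_loss}, eval_mse={best_mse}")
print(f"epochs with higher eval_loss than the previous one: {regressions}")
```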