Plainly Optimized Network
Dataset: SUPERGLUE
Trainer Hyperparameters:
- lr = 5e-05
- per_device_batch_size = 8
- gradient_accumulation_steps = 2
- weight_decay = 1e-09
- seed = 42
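
The batch size and gradient accumulation settings together determine how many examples contribute to each optimizer step. The following is a minimal sketch of that arithmetic using the values listed above. The helper `effective_batch_size` and its `num_devices` parameter are hypothetical names introduced for illustration, and the default of a single device is an assumption, since this card does not state how many devices were used.

```python
# Hyperparameters copied from this card.
hparams = {
    "lr": 5e-05,
    "per_device_batch_size": 8,
    "gradient_accumulation_steps": 2,
    "weight_decay": 1e-09,
    "seed": 42,
}


def effective_batch_size(per_device_batch_size, gradient_accumulation_steps, num_devices=1):
    """Examples consumed per optimizer step.

    num_devices=1 is an assumption; the card does not report the device count.
    """
    return per_device_batch_size * gradient_accumulation_steps * num_devices


effective = effective_batch_size(
    hparams["per_device_batch_size"],
    hparams["gradient_accumulation_steps"],
)
print(effective)  # 16 under the single-device assumption
```

If training ran on more than one device, the result scales linearly with `num_devices`.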
eval_loss | eval_accuracy | epoch |
---|---|---|
20.646 | 0.587 | 1.0 |
20.427 | 0.609 | 2.0 |
20.235 | 0.580 | 3.0 |
19.325 | 0.623 | 4.0 |
19.596 | 0.623 | 5.0 |
18.737 | 0.717 | 6.0 |
18.698 | 0.717 | 7.0 |
18.311 | 0.725 | 8.0 |
18.347 | 0.739 | 9.0 |
18.437 | 0.710 | 10.0 |
17.924 | 0.717 | 11.0 |
18.021 | 0.732 | 12.0 |
17.819 | 0.739 | 13.0 |
18.181 | 0.725 | 14.0 |
18.062 | 0.732 | 15.0 |
18.068 | 0.725 | 16.0 |
18.177 | 0.717 | 17.0 |
17.865 | 0.732 | 18.0 |
17.977 | 0.732 | 19.0 |
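
To read the table at a glance, the short sketch below picks out the epoch with the lowest eval_loss and the epochs with the highest eval_accuracy. The rows are copied from the table above. The variable names are illustrative, and the card does not say which criterion, if any, was used to select the published checkpoint.

```python
# (eval_loss, eval_accuracy, epoch) rows copied from the table in this card.
results = [
    (20.646, 0.587, 1), (20.427, 0.609, 2), (20.235, 0.580, 3),
    (19.325, 0.623, 4), (19.596, 0.623, 5), (18.737, 0.717, 6),
    (18.698, 0.717, 7), (18.311, 0.725, 8), (18.347, 0.739, 9),
    (18.437, 0.710, 10), (17.924, 0.717, 11), (18.021, 0.732, 12),
    (17.819, 0.739, 13), (18.181, 0.725, 14), (18.062, 0.732, 15),
    (18.068, 0.725, 16), (18.177, 0.717, 17), (17.865, 0.732, 18),
    (17.977, 0.732, 19),
]

# Row with the lowest evaluation loss.
best_loss_row = min(results, key=lambda row: row[0])

# Highest accuracy, and every epoch that reaches it (ties are possible).
best_accuracy = max(row[1] for row in results)
best_accuracy_epochs = [row[2] for row in results if row[1] == best_accuracy]

print(f"lowest eval_loss: {best_loss_row[0]} at epoch {best_loss_row[2]}")
print(f"highest eval_accuracy: {best_accuracy} at epochs {best_accuracy_epochs}")
```

On these rows, the lowest eval_loss (17.819) falls at epoch 13, and the highest eval_accuracy (0.739) is reached at epochs 9 and 13.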