Plainly Optimized Network

Dataset: SUPERGLUE

Trainer Hyperparameters:

  • lr = 5e-05
  • per_device_batch_size = 8
  • gradient_accumulation_steps = 2
  • weight_decay = 1e-09
  • seed = 42
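
The values above imply an effective batch size larger than the per-device figure once gradient accumulation is counted. A minimal sketch of that arithmetic follows. The `TrainerConfig` class and the `effective_batch_size` helper are hypothetical names used for illustration only, and the device count is an assumption, since the card does not state how many devices were used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainerConfig:
    """Hypothetical container for the hyperparameters listed in this card."""
    lr: float = 5e-05
    per_device_batch_size: int = 8
    gradient_accumulation_steps: int = 2
    weight_decay: float = 1e-09
    seed: int = 42


def effective_batch_size(cfg: TrainerConfig, num_devices: int = 1) -> int:
    """Samples contributing to one optimizer step.

    num_devices is an assumption: the card does not report the device count.
    """
    return cfg.per_device_batch_size * cfg.gradient_accumulation_steps * num_devices


cfg = TrainerConfig()
print(effective_batch_size(cfg))  # single-device assumption
```

Under the single-device assumption, each optimizer step sees 8 × 2 = 16 samples; the figure scales linearly with the number of devices.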

Evaluation Results:

  eval_loss  eval_accuracy  epoch
  20.646     0.587          1.0
  20.427     0.609          2.0
  20.235     0.580          3.0
  19.325     0.623          4.0
  19.596     0.623          5.0
  18.737     0.717          6.0
  18.698     0.717          7.0
  18.311     0.725          8.0
  18.347     0.739          9.0
  18.437     0.710          10.0
  17.924     0.717          11.0
  18.021     0.732          12.0
  17.819     0.739          13.0
  18.181     0.725          14.0
  18.062     0.732          15.0
  18.068     0.725          16.0
  18.177     0.717          17.0
  17.865     0.732          18.0
  17.977     0.732          19.0
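
To pick a checkpoint from a log like this, one option is to take the epoch with the highest eval_accuracy and break ties with the lower eval_loss. The sketch below applies that rule to the rows reported above. The selection rule and the `best_epoch` helper are illustrative assumptions, not something the card says was used.

```python
# Rows copied from the evaluation log: (eval_loss, eval_accuracy, epoch).
EVAL_LOG = [
    (20.646, 0.587, 1), (20.427, 0.609, 2), (20.235, 0.580, 3),
    (19.325, 0.623, 4), (19.596, 0.623, 5), (18.737, 0.717, 6),
    (18.698, 0.717, 7), (18.311, 0.725, 8), (18.347, 0.739, 9),
    (18.437, 0.710, 10), (17.924, 0.717, 11), (18.021, 0.732, 12),
    (17.819, 0.739, 13), (18.181, 0.725, 14), (18.062, 0.732, 15),
    (18.068, 0.725, 16), (18.177, 0.717, 17), (17.865, 0.732, 18),
    (17.977, 0.732, 19),
]


def best_epoch(log):
    """Return the row with the highest accuracy; ties go to the lower loss.

    This selection rule is an assumption made for illustration.
    """
    return max(log, key=lambda row: (row[1], -row[0]))


loss, accuracy, epoch = best_epoch(EVAL_LOG)
print(f"best epoch: {epoch} (eval_accuracy={accuracy}, eval_loss={loss})")
```

Epochs 9 and 13 share the top accuracy of 0.739; the tie-break favors epoch 13, which also has the lowest eval_loss in the log (17.819).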