Hyperparameters:

  • learning rate: 2e-5
  • weight decay: 0.01
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • gradient_accumulation_steps: 1
  • eval_steps: 6000
  • max_length: 512
  • num_epochs: 2
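The list above maps onto the Hugging Face `transformers.TrainingArguments` naming convention. A minimal sketch as a plain config dict (the `num_train_epochs` key is the convention's name for the list's `num_epochs`; `max_length` is the tokenizer truncation length, applied at preprocessing rather than in the trainer):

```python
# Training configuration mirroring the hyperparameter list above.
# Keys follow the `transformers.TrainingArguments` naming convention.
training_config = {
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "gradient_accumulation_steps": 1,
    "eval_steps": 6000,
    "num_train_epochs": 2,
}

# Tokenizer truncation length used when preprocessing examples.
max_length = 512
```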

Dataset version:

  • craffel/tasky_or_not (config: 10xp3_10xc4, revision: 15f88c8)
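A sketch of loading exactly this dataset snapshot with the `datasets` library; the function name is illustrative, and the import is deferred so the snippet can be defined without the library installed:

```python
def load_tasky_or_not(split="train"):
    """Load the dataset snapshot listed above:
    craffel/tasky_or_not, config 10xp3_10xc4, git revision 15f88c8.
    Requires the `datasets` library and network access when called."""
    from datasets import load_dataset  # deferred: only needed at call time
    return load_dataset(
        "craffel/tasky_or_not",
        "10xp3_10xc4",
        split=split,
        revision="15f88c8",
    )
```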

Checkpoint:

  • 48000 steps

Results on the validation set:

| Step  | Training Loss | Validation Loss | Accuracy | Precision | Recall   | F1       |
|-------|---------------|-----------------|----------|-----------|----------|----------|
| 6000  | 0.031900      | 0.163412        | 0.982194 | 0.999211  | 0.980462 | 0.989748 |
| 12000 | 0.014700      | 0.106132        | 0.976666 | 0.999639  | 0.973733 | 0.986516 |
| 18000 | 0.010700      | 0.043012        | 0.995743 | 0.999223  | 0.995918 | 0.997568 |
| 24000 | 0.007400      | 0.095047        | 0.984724 | 0.999857  | 0.982714 | 0.991211 |
| 30000 | 0.004100      | 0.087274        | 0.990400 | 0.999829  | 0.989217 | 0.994495 |
| 36000 | 0.003100      | 0.162909        | 0.981972 | 1.000000  | 0.979434 | 0.989610 |
| 42000 | 0.002200      | 0.148721        | 0.980454 | 0.999986  | 0.977717 | 0.988726 |
| 48000 | 0.001000      | 0.094455        | 0.990437 | 0.999943  | 0.989147 | 0.994516 |
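As a sanity check, the F1 column is the harmonic mean of precision and recall; recomputing it for the final (step 48000) row reproduces the reported value up to rounding:

```python
# Verify F1 = 2 * P * R / (P + R) for the step-48000 row.
precision = 0.999943
recall = 0.989147
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 6))  # ≈ 0.994516, the reported value
```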

Dataset used to train taskydata/deberta-v3-base_10xp3_10xc4_512: craffel/tasky_or_not
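A hedged sketch of running the classifier from the Hub. The model id is taken from the heading above; the label semantics (which index means "tasky") are an assumption not stated on this card, so verify them against the model's `config.json` (`id2label`). Imports are deferred so the function can be defined without `transformers`/`torch` installed:

```python
def classify(texts, model_id="taskydata/deberta-v3-base_10xp3_10xc4_512"):
    """Return predicted label indices for a list of texts.
    Requires `transformers` and `torch` plus network access when called.
    NOTE: label meaning is an assumption; check the model's id2label map."""
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Mirror the training-time max_length of 512.
    batch = tokenizer(texts, truncation=True, max_length=512,
                      padding=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits
    return logits.argmax(dim=-1).tolist()
```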