vit_focus_full

This model is a fine-tuned version of an unspecified base model on an unspecified dataset (the card records the dataset as None). It achieves the following results on the evaluation set:

  • Loss: 0.0531
  • Mse: 0.1291
  • Mae: 0.3119
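
The metric names above suggest a regression task scored with mean squared error and mean absolute error. As a minimal sketch of how such values are usually computed (the card does not show the evaluation code, and it does not say which loss function produced the Loss value), the helpers below are hypothetical and use only the Python standard library:

```python
from typing import Sequence


def mean_squared_error(preds: Sequence[float], targets: Sequence[float]) -> float:
    """Average of squared differences between predictions and targets."""
    if len(preds) != len(targets) or len(preds) == 0:
        raise ValueError("preds and targets must be non-empty and equal in length")
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


def mean_absolute_error(preds: Sequence[float], targets: Sequence[float]) -> float:
    """Average of absolute differences between predictions and targets."""
    if len(preds) != len(targets) or len(preds) == 0:
        raise ValueError("preds and targets must be non-empty and equal in length")
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)


# Toy values for illustration only; these are not taken from the model's data.
example_preds = [0.2, 0.5, 0.9]
example_targets = [0.0, 0.5, 1.0]
example_mse = mean_squared_error(example_preds, example_targets)
example_mae = mean_absolute_error(example_preds, example_targets)
```

The square root of the MSE is always at least as large as the MAE, which holds for the reported pair: the square root of 0.1291 is about 0.359, above the reported 0.3119.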

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 30
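
Two of the values above follow from the others, and a short sketch can make the arithmetic explicit. The total train batch size is the per-device batch size times the gradient accumulation steps, and a linear scheduler decays the learning rate from its initial value to zero over the run. The sketch assumes a single device and zero warmup steps, neither of which the card states; the total of 1530 optimizer steps is taken from the last row of the training results.

```python
LEARNING_RATE = 5e-5          # learning_rate from the card
PER_DEVICE_BATCH_SIZE = 8     # train_batch_size from the card
GRAD_ACCUM_STEPS = 4          # gradient_accumulation_steps from the card
NUM_DEVICES = 1               # assumption: not stated in the card
WARMUP_STEPS = 0              # assumption: no warmup is listed in the card
TOTAL_STEPS = 1530            # final step in the training results table


def effective_batch_size() -> int:
    """Samples contributing to each optimizer update."""
    return PER_DEVICE_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear decay to zero."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 to the base rate (unused when WARMUP_STEPS == 0).
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = TOTAL_STEPS - step
    return LEARNING_RATE * max(0.0, remaining / max(1, TOTAL_STEPS - WARMUP_STEPS))


batch = effective_batch_size()   # matches total_train_batch_size: 32
lr_start = linear_lr(0)          # base learning rate
lr_midway = linear_lr(765)       # halfway through the run
lr_end = linear_lr(TOTAL_STEPS)  # fully decayed
```

Under these assumptions the learning rate is halved by step 765 and reaches zero at step 1530. A run with a nonzero warmup would follow a different curve early on.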

Training results

Training Loss | Epoch   | Step | Validation Loss | Mse    | Mae
0.3146        | 0.9855  | 51   | 0.0595          | 0.1403 | 0.3265
0.2488        | 1.9855  | 102  | 0.0566          | 0.1395 | 0.3253
0.2278        | 2.9855  | 153  | 0.0611          | 0.1426 | 0.3288
0.206         | 3.9855  | 204  | 0.0536          | 0.1323 | 0.3180
0.1902        | 4.9855  | 255  | 0.0619          | 0.1411 | 0.3271
0.187         | 5.9855  | 306  | 0.0508          | 0.1320 | 0.3169
0.1757        | 6.9855  | 357  | 0.0537          | 0.1339 | 0.3183
0.1523        | 7.9855  | 408  | 0.0558          | 0.1330 | 0.3168
0.1528        | 8.9855  | 459  | 0.0591          | 0.1381 | 0.3225
0.1416        | 9.9855  | 510  | 0.0536          | 0.1353 | 0.3198
0.1298        | 10.9855 | 561  | 0.0530          | 0.1325 | 0.3164
0.1161        | 11.9855 | 612  | 0.0511          | 0.1315 | 0.3156
0.1085        | 12.9855 | 663  | 0.0531          | 0.1385 | 0.3243
0.1028        | 13.9855 | 714  | 0.0530          | 0.1316 | 0.3151
0.0891        | 14.9855 | 765  | 0.0540          | 0.1338 | 0.3178
0.0878        | 15.9855 | 816  | 0.0536          | 0.1335 | 0.3177
0.077         | 16.9855 | 867  | 0.0534          | 0.1299 | 0.3132
0.0769        | 17.9855 | 918  | 0.0549          | 0.1313 | 0.3149
0.0663        | 18.9855 | 969  | 0.0531          | 0.1291 | 0.3119
0.064         | 19.9855 | 1020 | 0.0540          | 0.1352 | 0.3197
0.0608        | 20.9855 | 1071 | 0.0535          | 0.1334 | 0.3179
0.0548        | 21.9855 | 1122 | 0.0529          | 0.1299 | 0.3134
0.0517        | 22.9855 | 1173 | 0.0534          | 0.1310 | 0.3152
0.0498        | 23.9855 | 1224 | 0.0544          | 0.1314 | 0.3151
0.047         | 24.9855 | 1275 | 0.0531          | 0.1309 | 0.3145
0.0443        | 25.9855 | 1326 | 0.0537          | 0.1325 | 0.3164
0.042         | 26.9855 | 1377 | 0.0533          | 0.1319 | 0.3156
0.0397        | 27.9855 | 1428 | 0.0530          | 0.1317 | 0.3155
0.0411        | 28.9855 | 1479 | 0.0542          | 0.1328 | 0.3167
0.0382        | 29.9855 | 1530 | 0.0533          | 0.1327 | 0.3166
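
The per-epoch results can be searched for the best evaluation row, which shows where the headline numbers come from. The sketch below copies the step, validation loss, MSE, and MAE columns from the results above and picks the minimum of each. It also checks one reading of the recurring .9855 epoch fraction: 51 optimizer steps of 4 accumulated batches out of 207 batches per epoch. The figure of 207 is an inference from the logged numbers, not something the card states.

```python
# (step, validation_loss, mse, mae) copied from the training results.
ROWS = [
    (51, 0.0595, 0.1403, 0.3265), (102, 0.0566, 0.1395, 0.3253),
    (153, 0.0611, 0.1426, 0.3288), (204, 0.0536, 0.1323, 0.3180),
    (255, 0.0619, 0.1411, 0.3271), (306, 0.0508, 0.1320, 0.3169),
    (357, 0.0537, 0.1339, 0.3183), (408, 0.0558, 0.1330, 0.3168),
    (459, 0.0591, 0.1381, 0.3225), (510, 0.0536, 0.1353, 0.3198),
    (561, 0.0530, 0.1325, 0.3164), (612, 0.0511, 0.1315, 0.3156),
    (663, 0.0531, 0.1385, 0.3243), (714, 0.0530, 0.1316, 0.3151),
    (765, 0.0540, 0.1338, 0.3178), (816, 0.0536, 0.1335, 0.3177),
    (867, 0.0534, 0.1299, 0.3132), (918, 0.0549, 0.1313, 0.3149),
    (969, 0.0531, 0.1291, 0.3119), (1020, 0.0540, 0.1352, 0.3197),
    (1071, 0.0535, 0.1334, 0.3179), (1122, 0.0529, 0.1299, 0.3134),
    (1173, 0.0534, 0.1310, 0.3152), (1224, 0.0544, 0.1314, 0.3151),
    (1275, 0.0531, 0.1309, 0.3145), (1326, 0.0537, 0.1325, 0.3164),
    (1377, 0.0533, 0.1319, 0.3156), (1428, 0.0530, 0.1317, 0.3155),
    (1479, 0.0542, 0.1328, 0.3167), (1530, 0.0533, 0.1327, 0.3166),
]
STEP, VAL_LOSS, MSE, MAE = range(4)  # column indices into each row


def best_row(column: int) -> tuple:
    """Return the row with the smallest value in the given column."""
    return min(ROWS, key=lambda row: row[column])


best_by_loss = best_row(VAL_LOSS)
best_by_mse = best_row(MSE)
best_by_mae = best_row(MAE)

# Epoch fraction check. Assumption: 207 batches of size 8 per epoch,
# of which 51 * 4 = 204 are consumed by the 51 optimizer steps.
UPDATES_PER_EPOCH = 51
GRAD_ACCUM_STEPS = 4
BATCHES_PER_EPOCH = 207  # inferred, not stated in the card
epoch_fraction = round(UPDATES_PER_EPOCH * GRAD_ACCUM_STEPS / BATCHES_PER_EPOCH, 4)
```

The lowest validation loss (0.0508) occurs at step 306, while the lowest MSE and MAE both occur at step 969, and that row matches the evaluation results reported at the top of the card. If the inferred 207 batches per epoch is right, and assuming a single device with no dropped final batch, the training set would hold between 1649 and 1656 examples.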

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.7.0
  • Datasets 3.5.1
  • Tokenizers 0.21.1

Model size

  • Parameters: 24.3M (Safetensors)
  • Tensor type: F32