2025-06-01_23-21-32

This model is a fine-tuned version of nvidia/mit-b0 on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1614
  • Mean Iou: 0.5176
  • Mean Accuracy: 0.8876
  • Overall Accuracy: 0.9743
  • Per Category Iou: [0.9742468410406894, 0.06089158226797277]
  • Per Category Accuracy: [0.9746525149724692, 0.8004728992360859]
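The reported Mean Iou and Mean Accuracy are just the unweighted averages of the two per-category values above, which is why a very high overall accuracy (dominated by the majority class) can coexist with a low mean IoU. A minimal check against the numbers reported above:

```python
# Per-category values copied from the evaluation results above.
per_category_iou = [0.9742468410406894, 0.06089158226797277]
per_category_accuracy = [0.9746525149724692, 0.8004728992360859]

# Mean metrics are unweighted averages over the categories.
mean_iou = sum(per_category_iou) / len(per_category_iou)
mean_accuracy = sum(per_category_accuracy) / len(per_category_accuracy)

print(round(mean_iou, 4))       # matches the reported Mean Iou: 0.5176
print(round(mean_accuracy, 4))  # matches the reported Mean Accuracy: 0.8876
```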

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 6e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 50
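With lr_scheduler_type: linear, the learning rate decays linearly from 6e-05 toward 0 over the run (300 optimizer steps in the training results). A small sketch of that schedule, assuming no warmup since none is listed in the hyperparameters:

```python
def linear_lr(step, total_steps=300, base_lr=6e-05):
    """Linear decay from base_lr to 0 over total_steps.

    Assumes zero warmup steps (not listed in the hyperparameters above).
    """
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))    # 6e-05 at the start of training
print(linear_lr(150))  # 3e-05 halfway through
print(linear_lr(300))  # 0.0 at the final step
```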

Training results

| Training Loss | Epoch | Step | Validation Loss | Mean Iou | Mean Accuracy | Overall Accuracy | Per Category Iou | Per Category Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:-------------:|:----------------:|:----------------:|:---------------------:|
| 0.6021 | 3.3333 | 20 | 0.5909 | 0.4687 | 0.8817 | 0.9168 | [0.9166466186982658, 0.020747239422429665] | [0.9169402794152367, 0.8464896325936704] |
| 0.483 | 6.6667 | 40 | 0.4743 | 0.4702 | 0.8926 | 0.9188 | [0.9187025683078979, 0.02174875913808613] | [0.9189588772375197, 0.8663150236449618] |
| 0.4137 | 10.0 | 60 | 0.3513 | 0.4922 | 0.8785 | 0.9513 | [0.9512098322489387, 0.03328872784134478] | [0.951596165043716, 0.8053837759185158] |
| 0.3383 | 13.3333 | 80 | 0.2649 | 0.5071 | 0.8518 | 0.9683 | [0.9682495182020971, 0.04604880717631906] | [0.9687853719602414, 0.7348126591487814] |
| 0.2816 | 16.6667 | 100 | 0.2483 | 0.5149 | 0.8465 | 0.9745 | [0.974434504920804, 0.05532738053519113] | [0.9750081799140786, 0.7178974172426337] |
| 0.2936 | 20.0 | 120 | 0.2346 | 0.5096 | 0.8790 | 0.9690 | [0.9689592451845151, 0.05033140256996599] | [0.9693866241133998, 0.7886504183339396] |
| 0.2461 | 23.3333 | 140 | 0.1998 | 0.5145 | 0.8673 | 0.9732 | [0.9731369716247265, 0.0557941058807841] | [0.9736223392504542, 0.7610040014550745] |
| 0.2397 | 26.6667 | 160 | 0.1990 | 0.5092 | 0.8959 | 0.9679 | [0.9677955220165496, 0.050647636518198695] | [0.968151855644824, 0.8235722080756639] |
| 0.219 | 30.0 | 180 | 0.1780 | 0.5142 | 0.8876 | 0.9720 | [0.9719682833343325, 0.056377244744169414] | [0.9723682122845229, 0.8028373954165151] |
| 0.1903 | 33.3333 | 200 | 0.2008 | 0.5094 | 0.9029 | 0.9677 | [0.9676405121966583, 0.051242685179004516] | [0.9679681397091366, 0.8377591851582393] |
| 0.204 | 36.6667 | 220 | 0.1632 | 0.5199 | 0.8697 | 0.9765 | [0.9764641430635311, 0.06329649091018905] | [0.9769482050118011, 0.7624590760276464] |
| 0.27 | 40.0 | 240 | 0.1662 | 0.5158 | 0.8880 | 0.9731 | [0.9730663517970184, 0.05851614101169792] | [0.9734674712716104, 0.8024736267733721] |
| 0.1853 | 43.3333 | 260 | 0.1612 | 0.5195 | 0.8778 | 0.9759 | [0.9758665197616132, 0.06309928858645221] | [0.9763162070099016, 0.7791924336122227] |
| 0.2103 | 46.6667 | 280 | 0.1644 | 0.5161 | 0.8921 | 0.9731 | [0.9730533605611293, 0.05906435072936126] | [0.9734374845796283, 0.8108403055656602] |
| 0.2376 | 50.0 | 300 | 0.1614 | 0.5176 | 0.8876 | 0.9743 | [0.9742468410406894, 0.06089158226797277] | [0.9746525149724692, 0.8004728992360859] |
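For readers unfamiliar with the Per Category Iou column: each entry is intersection-over-union for one class, computed over all pixels. A toy sketch of that computation on a tiny flattened label map (the arrays below are made up for illustration; the real evaluation runs over the full validation set):

```python
def per_class_iou(pred, target, num_classes=2):
    """IoU per class: |pred ∩ target| / |pred ∪ target| over flat pixel labels."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        ious.append(inter / union if union else float("nan"))
    return ious

# Toy flattened label maps: class 0 = majority class, class 1 = minority class.
pred   = [0, 0, 0, 1, 1, 0, 0, 0]
target = [0, 0, 0, 1, 0, 1, 0, 0]
print(per_class_iou(pred, target))  # [5/7 ≈ 0.714, 1/3 ≈ 0.333]
```

Even on this toy example the minority-class IoU is far below its pixel accuracy, mirroring the gap between the two columns in the table above.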

Framework versions

  • Transformers 4.52.4
  • Pytorch 2.7.0+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.1
Model size: 3.72M parameters (F32, Safetensors)
