2025-06-07_17-20-54

This model is a fine-tuned version of nvidia/mit-b0 on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2248
  • Mean Iou: 0.5704
  • Mean Accuracy: 0.9255
  • Overall Accuracy: 0.9361
  • Per Category Iou: [0.935069480905877, 0.2056569754338517]
  • Per Category Accuracy: [0.9365438001020377, 0.9143744676587003]
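For reference, the Mean Iou and Mean Accuracy above are just the unweighted averages of the two per-category values. A minimal sketch (the card does not name the two categories; background/foreground is an assumption):

```python
# Mean Iou / Mean Accuracy are simple averages of the per-category values
# reported above (category 0 = background, category 1 = foreground is an
# assumption -- the card does not name the classes).
per_category_iou = [0.935069480905877, 0.2056569754338517]
per_category_acc = [0.9365438001020377, 0.9143744676587003]

mean_iou = sum(per_category_iou) / len(per_category_iou)
mean_acc = sum(per_category_acc) / len(per_category_acc)

print(round(mean_iou, 4))  # 0.5704
print(round(mean_acc, 4))  # 0.9255
```

The large gap between the two per-category IoU values suggests the second class is much harder (or much rarer) than the first, even though its per-category accuracy is high.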

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 6e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 50
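
The hyperparameters above can be expressed as a Hugging Face TrainingArguments configuration. This is a sketch, not the author's actual training script; the output_dir name is an assumption, and dataset loading, model instantiation, and compute_metrics are omitted:

```python
# Config fragment mirroring the listed hyperparameters (assumes the
# Hugging Face Trainer API; output_dir is a guess based on the card title).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="2025-06-07_17-20-54",
    learning_rate=6e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",          # betas=(0.9, 0.999), epsilon=1e-08 are defaults
    lr_scheduler_type="linear",
    num_train_epochs=50,
)
```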

Training results

| Training Loss | Epoch | Step | Validation Loss | Mean Iou | Mean Accuracy | Overall Accuracy | Per Category Iou | Per Category Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:-------------:|:----------------:|:----------------:|:---------------------:|
| 0.7992 | 3.3333 | 20 | 0.6115 | 0.4037 | 0.8231 | 0.7504 | [0.7462764909230546, 0.06111183049253396] | [0.7476707078889173, 0.8985420111228017] |
| 0.6606 | 6.6667 | 40 | 0.5015 | 0.4845 | 0.9082 | 0.8609 | [0.8584006674256637, 0.11063117729822997] | [0.8590754054984118, 0.9573124906057418] |
| 0.5693 | 10.0 | 60 | 0.4045 | 0.5108 | 0.9200 | 0.8891 | [0.8871993879450994, 0.13440753734721594] | [0.887982707099186, 0.9520517059972945] |
| 0.4934 | 13.3333 | 80 | 0.3334 | 0.5260 | 0.9219 | 0.9036 | [0.9019458222052652, 0.15001278200293985] | [0.9029285560741499, 0.9408286988326069] |
| 0.4618 | 16.6667 | 100 | 0.3002 | 0.5338 | 0.9253 | 0.9101 | [0.908524638509248, 0.159114206080938] | [0.9095111877060243, 0.9410291096748334] |
| 0.3847 | 20.0 | 120 | 0.2632 | 0.5457 | 0.9237 | 0.9198 | [0.9184461886734042, 0.17298888162197515] | [0.9196697520926495, 0.9276516859562103] |
| 0.3854 | 23.3333 | 140 | 0.2536 | 0.5445 | 0.9270 | 0.9185 | [0.9170875595401002, 0.17191109474943628] | [0.9181705542788028, 0.9358685304874994] |
| 0.351 | 26.6667 | 160 | 0.2630 | 0.5463 | 0.9265 | 0.9200 | [0.9185796005322305, 0.17411135724233331] | [0.919707578006722, 0.9333132922491106] |
| 0.3659 | 30.0 | 180 | 0.2428 | 0.5625 | 0.9253 | 0.9313 | [0.9301410370970862, 0.19476299694189603] | [0.9315286374459942, 0.9189839170299113] |
| 0.3424 | 33.3333 | 200 | 0.2422 | 0.5547 | 0.9272 | 0.9259 | [0.9246171012006387, 0.1846854477872226] | [0.925833530919917, 0.9285535347462298] |
| 0.3278 | 36.6667 | 220 | 0.2334 | 0.5636 | 0.9266 | 0.9319 | [0.9307508979840385, 0.19651751929366998] | [0.9321024842399713, 0.9211383335838469] |
| 0.3375 | 40.0 | 240 | 0.2248 | 0.5626 | 0.9264 | 0.9313 | [0.930139062262491, 0.19514964975589047] | [0.9314880437821117, 0.9212385390049602] |
| 0.3145 | 43.3333 | 260 | 0.2187 | 0.5736 | 0.9224 | 0.9384 | [0.9373324958509626, 0.2099115208657486] | [0.9389591231030535, 0.9057568014429581] |
| 0.3089 | 46.6667 | 280 | 0.2231 | 0.5712 | 0.9237 | 0.9368 | [0.9357921282702163, 0.2067027027027027] | [0.9373418346306391, 0.9100656345508292] |
| 0.321 | 50.0 | 300 | 0.2248 | 0.5704 | 0.9255 | 0.9361 | [0.935069480905877, 0.2056569754338517] | [0.9365438001020377, 0.9143744676587003] |
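The step and epoch columns above imply a small training split. A back-of-the-envelope check, assuming a single device and no gradient accumulation (neither is stated in the card):

```python
# Evaluation checkpoints land every 20 steps (~3.33 epochs), so the
# step/epoch ratio pins down the number of optimizer steps per epoch.
total_steps, num_epochs, batch_size = 300, 50, 16

steps_per_epoch = total_steps // num_epochs   # 6 optimizer steps per epoch
# With train_batch_size=16 and no gradient accumulation (an assumption),
# the training split holds at most 6 * 16 = 96 images.
max_train_samples = steps_per_epoch * batch_size

print(steps_per_epoch, max_train_samples)  # 6 96
```

The actual count could be slightly lower than 96 if the last batch of each epoch is partial.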

Framework versions

  • Transformers 4.52.4
  • Pytorch 2.7.1+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.1
