# 2025-06-07_17-20-54
This model is a fine-tuned version of nvidia/mit-b0 on the generator dataset. It achieves the following results on the evaluation set:
- Loss: 0.2248
- Mean Iou: 0.5704
- Mean Accuracy: 0.9255
- Overall Accuracy: 0.9361
- Per Category Iou: [0.935069480905877, 0.2056569754338517]
- Per Category Accuracy: [0.9365438001020377, 0.9143744676587003]
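As a sanity check, the mean metrics above can be reproduced from the per-category values: Mean IoU and Mean Accuracy are unweighted means over the two categories (which classes the two entries correspond to is not stated in this card). A minimal sketch, not part of the original evaluation code:

```python
# Per-category evaluation metrics reported above (two categories).
per_category_iou = [0.935069480905877, 0.2056569754338517]
per_category_accuracy = [0.9365438001020377, 0.9143744676587003]

def mean(values):
    """Unweighted mean over categories, as used for Mean IoU / Mean Accuracy."""
    return sum(values) / len(values)

mean_iou = mean(per_category_iou)            # ~0.5704
mean_accuracy = mean(per_category_accuracy)  # ~0.9255

print(f"Mean IoU:      {mean_iou:.4f}")
print(f"Mean Accuracy: {mean_accuracy:.4f}")
```

Note that Overall Accuracy (0.9361) is not a simple mean; it is computed over all pixels, so the dominant category weighs more heavily.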
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 6e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 50
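With a linear scheduler, the learning rate decays from its initial value to zero over the course of training. A plain-Python illustration of that schedule, assuming no warmup and the 300 total optimizer steps shown in the results table below:

```python
initial_lr = 6e-05   # learning_rate hyperparameter above
total_steps = 300    # final step in the training results table

def linear_lr(step, initial_lr=initial_lr, total_steps=total_steps):
    """Linearly decay the learning rate to zero over training (no warmup assumed)."""
    return initial_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))    # 6e-05 at the start of training
print(linear_lr(150))  # 3e-05 halfway through
print(linear_lr(300))  # 0.0 at the end
```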
### Training results
| Training Loss | Epoch | Step | Validation Loss | Mean Iou | Mean Accuracy | Overall Accuracy | Per Category Iou | Per Category Accuracy |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| 0.7992 | 3.3333 | 20 | 0.6115 | 0.4037 | 0.8231 | 0.7504 | [0.7462764909230546, 0.06111183049253396] | [0.7476707078889173, 0.8985420111228017] |
| 0.6606 | 6.6667 | 40 | 0.5015 | 0.4845 | 0.9082 | 0.8609 | [0.8584006674256637, 0.11063117729822997] | [0.8590754054984118, 0.9573124906057418] |
| 0.5693 | 10.0 | 60 | 0.4045 | 0.5108 | 0.9200 | 0.8891 | [0.8871993879450994, 0.13440753734721594] | [0.887982707099186, 0.9520517059972945] |
| 0.4934 | 13.3333 | 80 | 0.3334 | 0.5260 | 0.9219 | 0.9036 | [0.9019458222052652, 0.15001278200293985] | [0.9029285560741499, 0.9408286988326069] |
| 0.4618 | 16.6667 | 100 | 0.3002 | 0.5338 | 0.9253 | 0.9101 | [0.908524638509248, 0.159114206080938] | [0.9095111877060243, 0.9410291096748334] |
| 0.3847 | 20.0 | 120 | 0.2632 | 0.5457 | 0.9237 | 0.9198 | [0.9184461886734042, 0.17298888162197515] | [0.9196697520926495, 0.9276516859562103] |
| 0.3854 | 23.3333 | 140 | 0.2536 | 0.5445 | 0.9270 | 0.9185 | [0.9170875595401002, 0.17191109474943628] | [0.9181705542788028, 0.9358685304874994] |
| 0.351 | 26.6667 | 160 | 0.2630 | 0.5463 | 0.9265 | 0.9200 | [0.9185796005322305, 0.17411135724233331] | [0.919707578006722, 0.9333132922491106] |
| 0.3659 | 30.0 | 180 | 0.2428 | 0.5625 | 0.9253 | 0.9313 | [0.9301410370970862, 0.19476299694189603] | [0.9315286374459942, 0.9189839170299113] |
| 0.3424 | 33.3333 | 200 | 0.2422 | 0.5547 | 0.9272 | 0.9259 | [0.9246171012006387, 0.1846854477872226] | [0.925833530919917, 0.9285535347462298] |
| 0.3278 | 36.6667 | 220 | 0.2334 | 0.5636 | 0.9266 | 0.9319 | [0.9307508979840385, 0.19651751929366998] | [0.9321024842399713, 0.9211383335838469] |
| 0.3375 | 40.0 | 240 | 0.2248 | 0.5626 | 0.9264 | 0.9313 | [0.930139062262491, 0.19514964975589047] | [0.9314880437821117, 0.9212385390049602] |
| 0.3145 | 43.3333 | 260 | 0.2187 | 0.5736 | 0.9224 | 0.9384 | [0.9373324958509626, 0.2099115208657486] | [0.9389591231030535, 0.9057568014429581] |
| 0.3089 | 46.6667 | 280 | 0.2231 | 0.5712 | 0.9237 | 0.9368 | [0.9357921282702163, 0.2067027027027027] | [0.9373418346306391, 0.9100656345508292] |
| 0.321 | 50.0 | 300 | 0.2248 | 0.5704 | 0.9255 | 0.9361 | [0.935069480905877, 0.2056569754338517] | [0.9365438001020377, 0.9143744676587003] |
### Framework versions
- Transformers 4.52.4
- Pytorch 2.7.1+cu126
- Datasets 3.6.0
- Tokenizers 0.21.1