2025-06-07_17-20-54

This model is a fine-tuned version of nvidia/mit-b0 on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2248
  • Mean Iou: 0.5704
  • Mean Accuracy: 0.9255
  • Overall Accuracy: 0.9361
  • Per Category Iou: [0.935069480905877, 0.2056569754338517]
  • Per Category Accuracy: [0.9365438001020377, 0.9143744676587003]
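For reference, the Mean Iou and Mean Accuracy above are just the unweighted averages of the two per-category values. A minimal sketch (the card does not name the two categories; background/foreground is an assumption):

```python
# Mean Iou / Mean Accuracy are simple averages of the per-category values
# reported above (category 0 = background, category 1 = foreground is an
# assumption -- the card does not name the classes).
per_category_iou = [0.935069480905877, 0.2056569754338517]
per_category_acc = [0.9365438001020377, 0.9143744676587003]

mean_iou = sum(per_category_iou) / len(per_category_iou)
mean_acc = sum(per_category_acc) / len(per_category_acc)

print(round(mean_iou, 4))  # 0.5704
print(round(mean_acc, 4))  # 0.9255
```

The large gap between the two per-category IoU values suggests the second class is much harder (or much rarer) than the first, even though its per-category accuracy is high.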

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 6e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 50
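
The hyperparameters above can be expressed as a Hugging Face TrainingArguments configuration. This is a sketch, not the author's actual training script; the output_dir name is an assumption, and dataset loading, model instantiation, and compute_metrics are omitted:

```python
# Config fragment mirroring the listed hyperparameters (assumes the
# Hugging Face Trainer API; output_dir is a guess based on the card title).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="2025-06-07_17-20-54",
    learning_rate=6e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",          # betas=(0.9, 0.999), epsilon=1e-08 are defaults
    lr_scheduler_type="linear",
    num_train_epochs=50,
)
```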

Training results

| Training Loss | Epoch | Step | Validation Loss | Mean Iou | Mean Accuracy | Overall Accuracy | Per Category Iou | Per Category Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:-------------:|:----------------:|:----------------:|:---------------------:|
| 0.7992 | 3.3333 | 20 | 0.6115 | 0.4037 | 0.8231 | 0.7504 | [0.7462764909230546, 0.06111183049253396] | [0.7476707078889173, 0.8985420111228017] |
| 0.6606 | 6.6667 | 40 | 0.5015 | 0.4845 | 0.9082 | 0.8609 | [0.8584006674256637, 0.11063117729822997] | [0.8590754054984118, 0.9573124906057418] |
| 0.5693 | 10.0 | 60 | 0.4045 | 0.5108 | 0.9200 | 0.8891 | [0.8871993879450994, 0.13440753734721594] | [0.887982707099186, 0.9520517059972945] |
| 0.4934 | 13.3333 | 80 | 0.3334 | 0.5260 | 0.9219 | 0.9036 | [0.9019458222052652, 0.15001278200293985] | [0.9029285560741499, 0.9408286988326069] |
| 0.4618 | 16.6667 | 100 | 0.3002 | 0.5338 | 0.9253 | 0.9101 | [0.908524638509248, 0.159114206080938] | [0.9095111877060243, 0.9410291096748334] |
| 0.3847 | 20.0 | 120 | 0.2632 | 0.5457 | 0.9237 | 0.9198 | [0.9184461886734042, 0.17298888162197515] | [0.9196697520926495, 0.9276516859562103] |
| 0.3854 | 23.3333 | 140 | 0.2536 | 0.5445 | 0.9270 | 0.9185 | [0.9170875595401002, 0.17191109474943628] | [0.9181705542788028, 0.9358685304874994] |
| 0.351 | 26.6667 | 160 | 0.2630 | 0.5463 | 0.9265 | 0.9200 | [0.9185796005322305, 0.17411135724233331] | [0.919707578006722, 0.9333132922491106] |
| 0.3659 | 30.0 | 180 | 0.2428 | 0.5625 | 0.9253 | 0.9313 | [0.9301410370970862, 0.19476299694189603] | [0.9315286374459942, 0.9189839170299113] |
| 0.3424 | 33.3333 | 200 | 0.2422 | 0.5547 | 0.9272 | 0.9259 | [0.9246171012006387, 0.1846854477872226] | [0.925833530919917, 0.9285535347462298] |
| 0.3278 | 36.6667 | 220 | 0.2334 | 0.5636 | 0.9266 | 0.9319 | [0.9307508979840385, 0.19651751929366998] | [0.9321024842399713, 0.9211383335838469] |
| 0.3375 | 40.0 | 240 | 0.2248 | 0.5626 | 0.9264 | 0.9313 | [0.930139062262491, 0.19514964975589047] | [0.9314880437821117, 0.9212385390049602] |
| 0.3145 | 43.3333 | 260 | 0.2187 | 0.5736 | 0.9224 | 0.9384 | [0.9373324958509626, 0.2099115208657486] | [0.9389591231030535, 0.9057568014429581] |
| 0.3089 | 46.6667 | 280 | 0.2231 | 0.5712 | 0.9237 | 0.9368 | [0.9357921282702163, 0.2067027027027027] | [0.9373418346306391, 0.9100656345508292] |
| 0.321 | 50.0 | 300 | 0.2248 | 0.5704 | 0.9255 | 0.9361 | [0.935069480905877, 0.2056569754338517] | [0.9365438001020377, 0.9143744676587003] |
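The step and epoch columns above imply a small training split. A back-of-the-envelope check, assuming a single device and no gradient accumulation (neither is stated in the card):

```python
# Evaluation checkpoints land every 20 steps (~3.33 epochs), so the
# step/epoch ratio pins down the number of optimizer steps per epoch.
total_steps, num_epochs, batch_size = 300, 50, 16

steps_per_epoch = total_steps // num_epochs   # 6 optimizer steps per epoch
# With train_batch_size=16 and no gradient accumulation (an assumption),
# the training split holds at most 6 * 16 = 96 images.
max_train_samples = steps_per_epoch * batch_size

print(steps_per_epoch, max_train_samples)  # 6 96
```

The actual count could be slightly lower than 96 if the last batch of each epoch is partial.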

Framework versions

  • Transformers 4.52.4
  • Pytorch 2.7.1+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.1
