2025-06-01_23-21-32

This model is a fine-tuned version of nvidia/mit-b0 on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1614
  • Mean Iou: 0.5176
  • Mean Accuracy: 0.8876
  • Overall Accuracy: 0.9743
  • Per Category Iou: [0.9742468410406894, 0.06089158226797277]
  • Per Category Accuracy: [0.9746525149724692, 0.8004728992360859]
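The reported Mean Iou and Mean Accuracy are just the unweighted averages of the two per-category values above, which is why a very high overall accuracy (dominated by the majority class) can coexist with a low mean IoU. A minimal check against the numbers reported above:

```python
# Per-category values copied from the evaluation results above.
per_category_iou = [0.9742468410406894, 0.06089158226797277]
per_category_accuracy = [0.9746525149724692, 0.8004728992360859]

# Mean metrics are unweighted averages over the categories.
mean_iou = sum(per_category_iou) / len(per_category_iou)
mean_accuracy = sum(per_category_accuracy) / len(per_category_accuracy)

print(round(mean_iou, 4))       # matches the reported Mean Iou: 0.5176
print(round(mean_accuracy, 4))  # matches the reported Mean Accuracy: 0.8876
```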

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 6e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 50
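With lr_scheduler_type: linear, the learning rate decays linearly from 6e-05 toward 0 over the run (300 optimizer steps in the training results). A small sketch of that schedule, assuming no warmup since none is listed in the hyperparameters:

```python
def linear_lr(step, total_steps=300, base_lr=6e-05):
    """Linear decay from base_lr to 0 over total_steps.

    Assumes zero warmup steps (not listed in the hyperparameters above).
    """
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))    # 6e-05 at the start of training
print(linear_lr(150))  # 3e-05 halfway through
print(linear_lr(300))  # 0.0 at the final step
```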

Training results

| Training Loss | Epoch | Step | Validation Loss | Mean Iou | Mean Accuracy | Overall Accuracy | Per Category Iou | Per Category Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:-------------:|:----------------:|:----------------:|:---------------------:|
| 0.6021 | 3.3333 | 20 | 0.5909 | 0.4687 | 0.8817 | 0.9168 | [0.9166466186982658, 0.020747239422429665] | [0.9169402794152367, 0.8464896325936704] |
| 0.483 | 6.6667 | 40 | 0.4743 | 0.4702 | 0.8926 | 0.9188 | [0.9187025683078979, 0.02174875913808613] | [0.9189588772375197, 0.8663150236449618] |
| 0.4137 | 10.0 | 60 | 0.3513 | 0.4922 | 0.8785 | 0.9513 | [0.9512098322489387, 0.03328872784134478] | [0.951596165043716, 0.8053837759185158] |
| 0.3383 | 13.3333 | 80 | 0.2649 | 0.5071 | 0.8518 | 0.9683 | [0.9682495182020971, 0.04604880717631906] | [0.9687853719602414, 0.7348126591487814] |
| 0.2816 | 16.6667 | 100 | 0.2483 | 0.5149 | 0.8465 | 0.9745 | [0.974434504920804, 0.05532738053519113] | [0.9750081799140786, 0.7178974172426337] |
| 0.2936 | 20.0 | 120 | 0.2346 | 0.5096 | 0.8790 | 0.9690 | [0.9689592451845151, 0.05033140256996599] | [0.9693866241133998, 0.7886504183339396] |
| 0.2461 | 23.3333 | 140 | 0.1998 | 0.5145 | 0.8673 | 0.9732 | [0.9731369716247265, 0.0557941058807841] | [0.9736223392504542, 0.7610040014550745] |
| 0.2397 | 26.6667 | 160 | 0.1990 | 0.5092 | 0.8959 | 0.9679 | [0.9677955220165496, 0.050647636518198695] | [0.968151855644824, 0.8235722080756639] |
| 0.219 | 30.0 | 180 | 0.1780 | 0.5142 | 0.8876 | 0.9720 | [0.9719682833343325, 0.056377244744169414] | [0.9723682122845229, 0.8028373954165151] |
| 0.1903 | 33.3333 | 200 | 0.2008 | 0.5094 | 0.9029 | 0.9677 | [0.9676405121966583, 0.051242685179004516] | [0.9679681397091366, 0.8377591851582393] |
| 0.204 | 36.6667 | 220 | 0.1632 | 0.5199 | 0.8697 | 0.9765 | [0.9764641430635311, 0.06329649091018905] | [0.9769482050118011, 0.7624590760276464] |
| 0.27 | 40.0 | 240 | 0.1662 | 0.5158 | 0.8880 | 0.9731 | [0.9730663517970184, 0.05851614101169792] | [0.9734674712716104, 0.8024736267733721] |
| 0.1853 | 43.3333 | 260 | 0.1612 | 0.5195 | 0.8778 | 0.9759 | [0.9758665197616132, 0.06309928858645221] | [0.9763162070099016, 0.7791924336122227] |
| 0.2103 | 46.6667 | 280 | 0.1644 | 0.5161 | 0.8921 | 0.9731 | [0.9730533605611293, 0.05906435072936126] | [0.9734374845796283, 0.8108403055656602] |
| 0.2376 | 50.0 | 300 | 0.1614 | 0.5176 | 0.8876 | 0.9743 | [0.9742468410406894, 0.06089158226797277] | [0.9746525149724692, 0.8004728992360859] |
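For readers unfamiliar with the Per Category Iou column: each entry is intersection-over-union for one class, computed over all pixels. A toy sketch of that computation on a tiny flattened label map (the arrays below are made up for illustration; the real evaluation runs over the full validation set):

```python
def per_class_iou(pred, target, num_classes=2):
    """IoU per class: |pred ∩ target| / |pred ∪ target| over flat pixel labels."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        ious.append(inter / union if union else float("nan"))
    return ious

# Toy flattened label maps: class 0 = majority class, class 1 = minority class.
pred   = [0, 0, 0, 1, 1, 0, 0, 0]
target = [0, 0, 0, 1, 0, 1, 0, 0]
print(per_class_iou(pred, target))  # [5/7 ≈ 0.714, 1/3 ≈ 0.333]
```

Even on this toy example the minority-class IoU is far below its pixel accuracy, mirroring the gap between the two columns in the table above.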

Framework versions

  • Transformers 4.52.4
  • Pytorch 2.7.0+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.1
Model size: 3.72M parameters (F32, Safetensors)
