squarerun_large_model

This model is a fine-tuned version of google/vit-large-patch16-224 on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 1.5150
F1 Macro: 0.4837
F1 Micro: 0.5909
F1 Weighted: 0.5569
Precision Macro: 0.5183
Precision Micro: 0.5909
Precision Weighted: 0.5764
Recall Macro: 0.5013
Recall Micro: 0.5909
Recall Weighted: 0.5909
Accuracy: 0.5909
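
The micro-averaged precision, recall, and F1 above, as well as weighted recall, all equal the accuracy (0.5909). This is expected whenever each image receives exactly one predicted label. The following is a minimal sketch of why, using hypothetical toy labels (`y_true`, `y_pred`) that are not taken from this model's evaluation set:

```python
from collections import Counter

# Hypothetical toy labels for illustration only; not data from this model.
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def micro_scores(y_true, y_pred):
    """Pool true/false positives and false negatives over all classes."""
    tp = fp = fn = 0
    for label in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1
            elif p == label:
                fp += 1
            elif t == label:
                fn += 1
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def weighted_recall(y_true, y_pred):
    """Per-class recall averaged with weights proportional to class support."""
    support = Counter(y_true)
    total = 0.0
    for label, count in support.items():
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        total += (count / len(y_true)) * (correct / count)
    return total


acc = accuracy(y_true, y_pred)
micro_p, micro_r, micro_f1 = micro_scores(y_true, y_pred)
w_recall = weighted_recall(y_true, y_pred)
print(acc, micro_p, micro_r, micro_f1, w_recall)
```

In single-label classification every wrong prediction counts once as a false positive (for the predicted class) and once as a false negative (for the true class), so the pooled counts collapse to the fraction of correct predictions. The macro and weighted F1 and precision values are the ones that carry extra information about per-class behavior.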

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 16
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 25
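
As a rough illustration of how these settings combine, the sketch below derives the effective batch size and a linear warmup-then-decay learning rate. It assumes a single device, takes the total of 725 optimizer steps from the final row of the training results, and assumes the warmup step count is rounded up; the names `linear_lr`, `TOTAL_STEPS`, and the other constants are illustrative and not part of any library.

```python
import math

BASE_LR = 1e-4            # learning_rate
PER_DEVICE_BATCH = 8      # train_batch_size
GRAD_ACCUM = 2            # gradient_accumulation_steps
WARMUP_RATIO = 0.1        # lr_scheduler_warmup_ratio
TOTAL_STEPS = 725         # last Step value reported in the training results

# Effective batch size per optimizer step (assumes one device).
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM

# Assumption: warmup steps are the ratio times total steps, rounded up.
warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def linear_lr(step):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Ramp linearly from 0 up to BASE_LR during warmup.
        return BASE_LR * step / warmup_steps
    # Decay linearly from BASE_LR down to 0 at the final step.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - warmup_steps)


print(effective_batch, warmup_steps)
for step in (0, 29, warmup_steps, 400, TOTAL_STEPS):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 0.0001 near the middle of epoch 3 (29 steps per epoch) and reaches zero at the end of epoch 25.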

Training results

Training Loss	Epoch	Step	Validation Loss	F1 Macro	F1 Micro	F1 Weighted	Precision Macro	Precision Micro	Precision Weighted	Recall Macro	Recall Micro	Recall Weighted	Accuracy
1.917	1.0	29	1.9115	0.1066	0.2197	0.1273	0.0780	0.2197	0.0923	0.1832	0.2197	0.2197	0.2197
1.6762	2.0	58	1.6722	0.2733	0.3561	0.3005	0.3141	0.3561	0.3684	0.3355	0.3561	0.3561	0.3561
1.9664	3.0	87	1.5057	0.3554	0.4545	0.4060	0.3734	0.4545	0.4129	0.3857	0.4545	0.4545	0.4545
1.1934	4.0	116	1.4217	0.3130	0.4091	0.3530	0.3414	0.4091	0.3818	0.3629	0.4091	0.4091	0.4091
1.0968	5.0	145	1.1879	0.4608	0.5758	0.5258	0.4807	0.5758	0.5438	0.5045	0.5758	0.5758	0.5758
1.1313	6.0	174	1.2307	0.4964	0.5530	0.5243	0.5850	0.5530	0.6114	0.5196	0.5530	0.5530	0.5530
1.0807	7.0	203	1.2771	0.4088	0.5303	0.4772	0.5393	0.5303	0.5816	0.4304	0.5303	0.5303	0.5303
1.1825	8.0	232	1.2339	0.4528	0.5682	0.5175	0.5544	0.5682	0.6169	0.4920	0.5682	0.5682	0.5682
0.4454	9.0	261	1.0474	0.6064	0.6970	0.6763	0.6334	0.6970	0.6868	0.6100	0.6970	0.6970	0.6970
0.5439	10.0	290	1.6815	0.4580	0.5152	0.4920	0.5394	0.5152	0.5951	0.4903	0.5152	0.5152	0.5152
0.4256	11.0	319	1.1378	0.5800	0.6667	0.6495	0.5801	0.6667	0.6435	0.5907	0.6667	0.6667	0.6667
0.4968	12.0	348	1.4229	0.5307	0.6136	0.6013	0.5348	0.6136	0.6095	0.5486	0.6136	0.6136	0.6136
0.3408	13.0	377	1.4445	0.5426	0.6288	0.6095	0.5559	0.6288	0.6307	0.5621	0.6288	0.6288	0.6288
0.2914	14.0	406	1.4277	0.6009	0.6515	0.6470	0.7068	0.6515	0.6868	0.5958	0.6515	0.6515	0.6515
0.2003	15.0	435	1.5517	0.5770	0.6288	0.6296	0.5890	0.6288	0.6475	0.5792	0.6288	0.6288	0.6288
0.0871	16.0	464	1.4812	0.5702	0.6515	0.6407	0.5777	0.6515	0.6491	0.5785	0.6515	0.6515	0.6515
0.0352	17.0	493	2.1052	0.5007	0.5985	0.5744	0.5466	0.5985	0.6130	0.5127	0.5985	0.5985	0.5985
0.0101	18.0	522	1.9978	0.5725	0.6212	0.6223	0.6152	0.6212	0.6559	0.5672	0.6212	0.6212	0.6212
0.0035	19.0	551	2.0304	0.5880	0.6439	0.6388	0.6698	0.6439	0.6936	0.5805	0.6439	0.6439	0.6439
0.0013	20.0	580	2.1374	0.5514	0.6364	0.6224	0.6025	0.6364	0.6765	0.5685	0.6364	0.6364	0.6364
0.0589	21.0	609	1.7676	0.5879	0.6439	0.6396	0.5940	0.6439	0.6407	0.5889	0.6439	0.6439	0.6439
0.0263	22.0	638	1.8416	0.5785	0.6439	0.6327	0.6016	0.6439	0.6454	0.5758	0.6439	0.6439	0.6439
0.0028	23.0	667	1.9843	0.6068	0.6667	0.6569	0.6631	0.6667	0.6882	0.6069	0.6667	0.6667	0.6667
0.0006	24.0	696	1.9432	0.6157	0.6742	0.6655	0.6603	0.6742	0.6853	0.6152	0.6742	0.6742	0.6742
0.0004	25.0	725	1.9346	0.6089	0.6667	0.6569	0.6548	0.6667	0.6763	0.6073	0.6667	0.6667	0.6667
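
In the table above, training loss falls to nearly zero while validation loss reaches its minimum at epoch 9 and then rises, a pattern consistent with overfitting. The sketch below shows one way to pick a checkpoint from such a log by different criteria. The rows are copied from the table (epoch, validation loss, F1 macro, accuracy); the selection logic is only an illustration and is not necessarily how the published checkpoint was chosen.

```python
# (epoch, validation_loss, f1_macro, accuracy), copied from the training results.
ROWS = [
    (1, 1.9115, 0.1066, 0.2197), (2, 1.6722, 0.2733, 0.3561),
    (3, 1.5057, 0.3554, 0.4545), (4, 1.4217, 0.3130, 0.4091),
    (5, 1.1879, 0.4608, 0.5758), (6, 1.2307, 0.4964, 0.5530),
    (7, 1.2771, 0.4088, 0.5303), (8, 1.2339, 0.4528, 0.5682),
    (9, 1.0474, 0.6064, 0.6970), (10, 1.6815, 0.4580, 0.5152),
    (11, 1.1378, 0.5800, 0.6667), (12, 1.4229, 0.5307, 0.6136),
    (13, 1.4445, 0.5426, 0.6288), (14, 1.4277, 0.6009, 0.6515),
    (15, 1.5517, 0.5770, 0.6288), (16, 1.4812, 0.5702, 0.6515),
    (17, 2.1052, 0.5007, 0.5985), (18, 1.9978, 0.5725, 0.6212),
    (19, 2.0304, 0.5880, 0.6439), (20, 2.1374, 0.5514, 0.6364),
    (21, 1.7676, 0.5879, 0.6439), (22, 1.8416, 0.5785, 0.6439),
    (23, 1.9843, 0.6068, 0.6667), (24, 1.9432, 0.6157, 0.6742),
    (25, 1.9346, 0.6089, 0.6667),
]

# Lower is better for loss; higher is better for F1 and accuracy.
best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_f1 = max(ROWS, key=lambda row: row[2])
best_by_acc = max(ROWS, key=lambda row: row[3])

print("lowest validation loss: epoch", best_by_loss[0])
print("highest F1 macro:       epoch", best_by_f1[0])
print("highest accuracy:       epoch", best_by_acc[0])
```

The criteria disagree here: epoch 9 has both the lowest validation loss (1.0474) and the highest accuracy (0.6970), while epoch 24 has the highest F1 macro (0.6157). Which one to prefer depends on whether balanced per-class performance or overall accuracy matters more for the intended use.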

Framework versions

Transformers 4.48.2
Pytorch 2.6.0+cu124
Datasets 3.2.0
Tokenizers 0.21.0

corranm/squarerun_large_model