finetune_colqwen

This model is a fine-tuned version of vidore/colqwen2.5-base. The training dataset was not recorded. At the last logged evaluation (step 3300, epoch 1.4930), it reaches a validation loss of 0.0101 on the evaluation set.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 16
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 100
num_epochs: 1.5
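
The values above can be related to each other with a short sketch. It shows how total_train_batch_size follows from train_batch_size and gradient_accumulation_steps, and what a linear schedule with 100 warmup steps looks like. Two things are assumptions rather than facts from this card: a single training device (no device count is listed), and the total number of optimizer steps, which is only a rough estimate taken from the training log in this card (step 2200 at epoch 0.9951). The names total_train_batch_size() and linear_lr() are illustrative helpers, not part of any library.

```python
import math

# Values copied from the hyperparameter list.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 4
WARMUP_STEPS = 100
NUM_EPOCHS = 1.5

# Assumption: one device, since the card lists no device count.
NUM_DEVICES = 1

# Assumption: rough estimate from the training log, where step 2200
# corresponds to epoch 0.9951.
STEPS_PER_EPOCH = round(2200 / 0.9951)
TOTAL_STEPS = math.ceil(NUM_EPOCHS * STEPS_PER_EPOCH)


def total_train_batch_size() -> int:
    """Number of examples contributing to one optimizer step."""
    return TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        scale = step / max(1, WARMUP_STEPS)
    else:
        remaining = TOTAL_STEPS - step
        scale = max(0.0, remaining / max(1, TOTAL_STEPS - WARMUP_STEPS))
    return LEARNING_RATE * scale


if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch_size())
    for step in (0, 50, 100, 1000, TOTAL_STEPS):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate climbs from 0 to 5e-05 over the first 100 steps and then decays linearly, reaching 0 at the estimated final step.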

Training Loss	Epoch	Step	Validation Loss	Model Preparation Time
No log	0.0005	1	0.0503	0.0325
0.0615	0.0452	100	0.0245	0.0325
0.0746	0.0905	200	0.0205	0.0325
0.0302	0.1357	300	0.0194	0.0325
0.103	0.1809	400	0.0179	0.0325
0.0972	0.2262	500	0.0161	0.0325
0.1049	0.2714	600	0.0155	0.0325
0.0934	0.3166	700	0.0161	0.0325
0.0659	0.3619	800	0.0153	0.0325
0.0677	0.4071	900	0.0153	0.0325
0.0114	0.4523	1000	0.0136	0.0325
0.0446	0.4976	1100	0.0131	0.0325
0.0299	0.5428	1200	0.0126	0.0325
0.0268	0.5880	1300	0.0126	0.0325
0.0126	0.6333	1400	0.0118	0.0325
0.0845	0.6785	1500	0.0116	0.0325
0.0344	0.7237	1600	0.0115	0.0325
0.145	0.7690	1700	0.0113	0.0325
0.028	0.8142	1800	0.0110	0.0325
0.024	0.8594	1900	0.0109	0.0325
0.0207	0.9047	2000	0.0106	0.0325
0.0171	0.9499	2100	0.0105	0.0325
0.0413	0.9951	2200	0.0104	0.0325
0.0105	1.0407	2300	0.0104	0.0325
0.0064	1.0859	2400	0.0103	0.0325
0.0372	1.1312	2500	0.0102	0.0325
0.0289	1.1764	2600	0.0102	0.0325
0.0117	1.2216	2700	0.0101	0.0325
0.0217	1.2669	2800	0.0101	0.0325
0.0361	1.3121	2900	0.0102	0.0325
0.0283	1.3573	3000	0.0100	0.0325
0.0335	1.4026	3100	0.0101	0.0325
0.0143	1.4478	3200	0.0101	0.0325
0.0354	1.4930	3300	0.0101	0.0325
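
To pick a checkpoint from a log like this one, the tab-separated rows can be parsed and searched for the lowest validation loss. The sketch below uses only a five-row excerpt copied from the table above. The names LOG_EXCERPT, parse_log() and best_checkpoint() are hypothetical helpers introduced for illustration.

```python
import csv
import io

# Five rows copied verbatim from the training log, tab-separated.
LOG_EXCERPT = (
    "Training Loss\tEpoch\tStep\tValidation Loss\tModel Preparation Time\n"
    "No log\t0.0005\t1\t0.0503\t0.0325\n"
    "0.0114\t0.4523\t1000\t0.0136\t0.0325\n"
    "0.0207\t0.9047\t2000\t0.0106\t0.0325\n"
    "0.0283\t1.3573\t3000\t0.0100\t0.0325\n"
    "0.0354\t1.4930\t3300\t0.0101\t0.0325\n"
)


def parse_log(text):
    """Parse the tab-separated log into a list of typed row dictionaries."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = []
    for raw in reader:
        train_loss = raw["Training Loss"]
        rows.append({
            "step": int(raw["Step"]),
            "epoch": float(raw["Epoch"]),
            # The first row has no training loss yet ("No log").
            "train_loss": None if train_loss == "No log" else float(train_loss),
            "val_loss": float(raw["Validation Loss"]),
        })
    return rows


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row["val_loss"])


if __name__ == "__main__":
    best = best_checkpoint(parse_log(LOG_EXCERPT))
    print(best["step"], best["val_loss"])
```

In the full table the lowest validation loss is also the 0.0100 logged at step 3000. From step 2500 onward the loss stays between 0.0100 and 0.0102.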