dhivehi-nougat-small-text-sen-multiline

This model is a fine-tuned version of facebook/nougat-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5761

Model description

More information needed

Intended uses & limitations

More information needed
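
No usage details are provided, but since the base model is facebook/nougat-small, inference presumably follows the standard Nougat image-to-text pipeline in transformers. The snippet below is a minimal sketch under that assumption; the image filename is a placeholder, and generation settings (max_new_tokens, banned unk token) are illustrative rather than taken from this repository.

```python
import torch
from PIL import Image
from transformers import NougatProcessor, VisionEncoderDecoderModel

model_id = "alakxender/dhivehi-nougat-small-text-sen-multiline"
processor = NougatProcessor.from_pretrained(model_id)
model = VisionEncoderDecoderModel.from_pretrained(model_id)

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# "line_image.png" is a placeholder for an image of Dhivehi text.
image = Image.open("line_image.png").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values

outputs = model.generate(
    pixel_values.to(device),
    max_new_tokens=512,
    bad_words_ids=[[processor.tokenizer.unk_token_id]],
)
text = processor.batch_decode(outputs, skip_special_tokens=True)[0]
text = processor.post_process_generation(text, fix_markdown=False)
print(text)
```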

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list for how they map onto transformers training arguments):

  • learning_rate: 5e-05
  • train_batch_size: 3
  • eval_batch_size: 3
  • seed: 42
  • gradient_accumulation_steps: 6
  • total_train_batch_size: 18
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 100
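
As a rough guide, these settings correspond to transformers training arguments roughly as sketched below. This is not the original training script: output_dir, the choice of Seq2SeqTrainingArguments over plain TrainingArguments, and the evaluation/logging cadence (inferred from the 500-step interval in the results table) are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: output_dir, eval/logging cadence, and the use of
# Seq2SeqTrainingArguments (vs. TrainingArguments) are assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="dhivehi-nougat-small-text-sen-multiline",  # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=3,
    per_device_eval_batch_size=3,
    gradient_accumulation_steps=6,   # effective train batch size: 3 * 6 = 18
    num_train_epochs=100,
    lr_scheduler_type="linear",
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    eval_strategy="steps",           # evaluation every 500 steps, per the results table
    eval_steps=500,
    logging_steps=500,
)
```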

Training results

Training Loss   Epoch     Step     Validation Loss
4.9984          0.1915    500      0.7656
4.4373          0.3830    1000     0.6860
4.1498          0.5746    1500     0.6523
4.0507          0.7661    2000     0.6342
4.04            0.9576    2500     0.6209
3.9678          1.1490    3000     0.6135
3.9341          1.3405    3500     0.6075
3.8322          1.5320    4000     0.6028
3.864           1.7236    4500     0.6002
3.8866          1.9151    5000     0.5972
3.8513          2.1065    5500     0.5944
3.7987          2.2980    6000     0.5930
3.85            2.4895    6500     0.5911
3.7525          2.6811    7000     0.5901
3.8161          2.8726    7500     0.5884
3.774           3.0640    8000     0.5879
3.8582          3.2555    8500     0.5869
3.8049          3.4470    9000     0.5855
3.7704          3.6385    9500     0.5855
3.8524          3.8301    10000    0.5844
3.7806          4.0215    10500    0.5839
3.7578          4.2130    11000    0.5834
3.7702          4.4045    11500    0.5829
3.7018          4.5960    12000    0.5827
3.7466          4.7875    12500    0.5816
3.7695          4.9791    13000    0.5816
3.8066          5.1705    13500    0.5811
3.7632          5.3620    14000    0.5813
3.7906          5.5535    14500    0.5801
3.7567          5.7450    15000    0.5805
3.7465          5.9365    15500    0.5802
3.7318          6.1279    16000    0.5797
3.7349          6.3195    16500    0.5792
3.724           6.5110    17000    0.5795
3.7208          6.7025    17500    0.5793
3.7877          6.8940    18000    0.5788
3.8067          7.0854    18500    0.5788
3.7721          7.2769    19000    0.5782
3.7535          7.4685    19500    0.5781
3.7339          7.6600    20000    0.5778
3.7472          7.8515    20500    0.5784
3.7907          8.0429    21000    0.5780
3.7457          8.2344    21500    0.5778
3.7464          8.4259    22000    0.5777
3.7859          8.6175    22500    0.5771
3.7792          8.8090    23000    0.5775
3.4678          9.0004    23500    0.5773
3.734           9.1919    24000    0.5769
3.7741          9.3834    24500    0.5770
3.8595          9.5749    25000    0.5766
3.7799          9.7665    25500    0.5767
3.6788          9.9580    26000    0.5768
3.7228          10.1494   26500    0.5766
3.7604          10.3409   27000    0.5763
3.7169          10.5324   27500    0.5765
3.731           10.7240   28000    0.5765
3.7575          10.9155   28500    0.5760
3.9147          11.1069   29000    0.5759
3.6776          11.2984   29500    0.5762
3.7124          11.4899   30000    0.5763
3.7571          11.6814   30500    0.5761

Framework versions

  • Transformers 4.47.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0