synthAIze_telugu_colloquial_trans

This model is a fine-tuned version of unsloth/tinyllama-chat-bnb-4bit on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 9.3740

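For reference, if the reported value is a mean per-token cross-entropy in nats (the usual convention for the causal-LM evaluation loss logged by the Hugging Face Trainer), the implied perplexity would be roughly:

$$\mathrm{PPL} = \exp(\mathcal{L}_{\text{eval}}) = \exp(9.3740) \approx 1.18 \times 10^{4}$$
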
Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2
  • num_epochs: 7
  • mixed_precision_training: Native AMP

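For context, below is a minimal sketch of how these hyperparameters map onto `transformers.TrainingArguments`. The output directory, evaluation cadence, and fp16 flag are assumptions; the dataset, prompt format, and LoRA/PEFT configuration used for the actual run are not documented in this card.

```python
# Hypothetical reconstruction of the reported training configuration.
# Anything not listed under "Training hyperparameters" above is an assumption.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="synthAIze_telugu_colloquial_trans",  # assumed
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,   # 8 x 2 = total train batch size of 16
    num_train_epochs=7,
    lr_scheduler_type="linear",
    warmup_steps=2,
    optim="adamw_torch",             # betas=(0.9, 0.999), eps=1e-08 are the defaults
    seed=42,
    fp16=True,                       # "Native AMP" mixed precision (bf16 also possible)
    eval_strategy="steps",
    eval_steps=4,                    # the results table logs validation loss every 4 steps
    logging_steps=4,
)
```

Reproducing the run would additionally require a `Trainer` (or Unsloth's wrappers), the LoRA configuration, and the undocumented training data.
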
Training results

Training Loss Epoch Step Validation Loss
14.5924 0.0261 4 9.4984
14.7094 0.0521 8 9.4984
14.6701 0.0782 12 9.4984
12.8211 0.1042 16 9.4985
9.5509 0.1303 20 9.4985
8.5853 0.1564 24 9.4986
5.5634 0.1824 28 9.4977
4.6341 0.2085 32 9.4931
4.2825 0.2345 36 9.4830
4.1122 0.2606 40 9.4710
4.1675 0.2866 44 9.4615
4.0636 0.3127 48 9.4545
3.9839 0.3388 52 9.4433
3.9887 0.3648 56 9.4321
3.9356 0.3909 60 9.4303
3.9948 0.4169 64 9.4278
3.9081 0.4430 68 9.4299
3.9618 0.4691 72 9.4265
3.988 0.4951 76 9.4219
4.0317 0.5212 80 9.4220
3.9991 0.5472 84 9.4183
3.8728 0.5733 88 9.4186
4.0057 0.5993 92 9.4189
3.9544 0.6254 96 9.4179
3.9929 0.6515 100 9.4196
3.9485 0.6775 104 9.4146
3.9823 0.7036 108 9.4137
3.9122 0.7296 112 9.4128
4.0261 0.7557 116 9.4095
3.91 0.7818 120 9.4123
4.0387 0.8078 124 9.4124
3.9793 0.8339 128 9.4130
3.8188 0.8599 132 9.4120
3.909 0.8860 136 9.4108
3.9257 0.9121 140 9.4136
3.9342 0.9381 144 9.4145
3.9592 0.9642 148 9.4175
3.9377 0.9902 152 9.4205
3.8464 1.0130 156 9.4188
3.9341 1.0391 160 9.4166
3.9102 1.0651 164 9.4179
3.9182 1.0912 168 9.4212
3.9517 1.1173 172 9.4220
3.9039 1.1433 176 9.4219
3.9169 1.1694 180 9.4230
3.8886 1.1954 184 9.4186
3.944 1.2215 188 9.4140
3.8584 1.2476 192 9.4161
3.8801 1.2736 196 9.4185
3.87 1.2997 200 9.4145
3.8289 1.3257 204 9.4100
3.8811 1.3518 208 9.4077
3.9058 1.3779 212 9.4038
3.8277 1.4039 216 9.4035
3.9422 1.4300 220 9.4021
3.8817 1.4560 224 9.4000
3.8859 1.4821 228 9.4018
3.9168 1.5081 232 9.4073
3.8357 1.5342 236 9.4063
3.906 1.5603 240 9.4013
3.886 1.5863 244 9.4047
3.8844 1.6124 248 9.4063
3.9357 1.6384 252 9.4055
3.8253 1.6645 256 9.4054
3.7222 1.6906 260 9.4068
3.9175 1.7166 264 9.4095
3.9249 1.7427 268 9.4093
3.9101 1.7687 272 9.4074
3.771 1.7948 276 9.4072
3.7613 1.8208 280 9.4064
3.9083 1.8469 284 9.4088
3.7913 1.8730 288 9.4087
3.8821 1.8990 292 9.4053
3.8123 1.9251 296 9.4038
3.8701 1.9511 300 9.4016
3.9222 1.9772 304 9.4022
3.8515 2.0 308 9.4044
3.746 2.0261 312 9.4035
3.7712 2.0521 316 9.4025
3.8676 2.0782 320 9.4002
3.8346 2.1042 324 9.3997
3.7751 2.1303 328 9.4002
3.8502 2.1564 332 9.4001
3.7592 2.1824 336 9.4002
3.8953 2.2085 340 9.3968
3.7921 2.2345 344 9.3946
3.8732 2.2606 348 9.3939
3.7933 2.2866 352 9.3922
3.8393 2.3127 356 9.3874
3.7572 2.3388 360 9.3874
3.856 2.3648 364 9.3885
3.842 2.3909 368 9.3881
3.891 2.4169 372 9.3874
3.7209 2.4430 376 9.3871
3.9075 2.4691 380 9.3875
3.8392 2.4951 384 9.3872
3.8123 2.5212 388 9.3876
3.8181 2.5472 392 9.3898
3.8469 2.5733 396 9.3908
3.9016 2.5993 400 9.3886
3.7321 2.6254 404 9.3852
3.807 2.6515 408 9.3849
3.8461 2.6775 412 9.3868
3.6914 2.7036 416 9.3895
3.8299 2.7296 420 9.3899
3.7647 2.7557 424 9.3909
3.828 2.7818 428 9.3918
3.8529 2.8078 432 9.3928
3.809 2.8339 436 9.3934
3.8223 2.8599 440 9.3936
3.7237 2.8860 444 9.3916
3.8755 2.9121 448 9.3904
3.8432 2.9381 452 9.3901
3.6971 2.9642 456 9.3897
3.8482 2.9902 460 9.3922
3.7642 3.0130 464 9.3940
3.7789 3.0391 468 9.3948
3.6887 3.0651 472 9.3940
3.8469 3.0912 476 9.3933
3.8252 3.1173 480 9.3926
3.8943 3.1433 484 9.3917
3.7709 3.1694 488 9.3928
3.7592 3.1954 492 9.3922
3.6 3.2215 496 9.3901
3.8423 3.2476 500 9.3903
3.7515 3.2736 504 9.3885
3.7935 3.2997 508 9.3876
3.7854 3.3257 512 9.3883
3.771 3.3518 516 9.3910
3.7701 3.3779 520 9.3934
3.9061 3.4039 524 9.3925
3.8134 3.4300 528 9.3894
3.7618 3.4560 532 9.3885
3.7996 3.4821 536 9.3885
3.6992 3.5081 540 9.3861
3.7507 3.5342 544 9.3859
3.738 3.5603 548 9.3844
3.7863 3.5863 552 9.3817
3.7974 3.6124 556 9.3769
3.722 3.6384 560 9.3766
3.8479 3.6645 564 9.3818
3.76 3.6906 568 9.3866
3.805 3.7166 572 9.3876
3.831 3.7427 576 9.3865
3.8149 3.7687 580 9.3854
3.7358 3.7948 584 9.3846
3.7091 3.8208 588 9.3824
3.7404 3.8469 592 9.3805
3.833 3.8730 596 9.3805
3.7432 3.8990 600 9.3810
3.7569 3.9251 604 9.3815
3.8761 3.9511 608 9.3816
3.7284 3.9772 612 9.3806
3.8089 4.0 616 9.3807
3.6199 4.0261 620 9.3816
3.7248 4.0521 624 9.3830
3.6517 4.0782 628 9.3833
3.7501 4.1042 632 9.3809
3.7044 4.1303 636 9.3792
3.7368 4.1564 640 9.3793
3.6834 4.1824 644 9.3791
3.8192 4.2085 648 9.3787
3.7202 4.2345 652 9.3805
3.7935 4.2606 656 9.3826
3.6746 4.2866 660 9.3834
3.7561 4.3127 664 9.3834
3.6912 4.3388 668 9.3837
3.7322 4.3648 672 9.3860
3.6658 4.3909 676 9.3871
3.7631 4.4169 680 9.3871
3.7579 4.4430 684 9.3872
3.793 4.4691 688 9.3866
3.6951 4.4951 692 9.3843
3.7384 4.5212 696 9.3826
3.634 4.5472 700 9.3813
3.7035 4.5733 704 9.3815
3.7729 4.5993 708 9.3821
3.763 4.6254 712 9.3815
3.6896 4.6515 716 9.3813
3.7827 4.6775 720 9.3818
3.7243 4.7036 724 9.3814
3.6214 4.7296 728 9.3809
3.7144 4.7557 732 9.3796
3.8698 4.7818 736 9.3781
3.8257 4.8078 740 9.3772
3.784 4.8339 744 9.3766
3.7899 4.8599 748 9.3775
3.7396 4.8860 752 9.3784
3.7271 4.9121 756 9.3782
3.807 4.9381 760 9.3783
3.7314 4.9642 764 9.3787
3.811 4.9902 768 9.3795
3.7654 5.0130 772 9.3790
3.7322 5.0391 776 9.3787
3.8652 5.0651 780 9.3792
3.6246 5.0912 784 9.3788
3.6057 5.1173 788 9.3783
3.5923 5.1433 792 9.3776
3.6224 5.1694 796 9.3766
3.6847 5.1954 800 9.3756
3.7316 5.2215 804 9.3757
3.597 5.2476 808 9.3761
3.6709 5.2736 812 9.3751
3.5557 5.2997 816 9.3740
3.6442 5.3257 820 9.3727
3.7169 5.3518 824 9.3737
3.6234 5.3779 828 9.3741
3.6838 5.4039 832 9.3742
3.6614 5.4300 836 9.3750
3.7273 5.4560 840 9.3758
3.7639 5.4821 844 9.3758
3.8223 5.5081 848 9.3755
3.7424 5.5342 852 9.3765
3.6588 5.5603 856 9.3764
3.6917 5.5863 860 9.3762
3.7315 5.6124 864 9.3766
3.7366 5.6384 868 9.3764
3.5316 5.6645 872 9.3761
3.6727 5.6906 876 9.3756
3.6911 5.7166 880 9.3752
3.6286 5.7427 884 9.3746
3.7405 5.7687 888 9.3751
3.6738 5.7948 892 9.3759
3.7719 5.8208 896 9.3762
3.6128 5.8469 900 9.3759
3.7748 5.8730 904 9.3759
3.5543 5.8990 908 9.3763
3.6143 5.9251 912 9.3761
3.7183 5.9511 916 9.3765
3.6779 5.9772 920 9.3767
3.7828 6.0 924 9.3764
3.621 6.0261 928 9.3763
3.6744 6.0521 932 9.3768
3.6675 6.0782 936 9.3772
3.6785 6.1042 940 9.3777
3.7176 6.1303 944 9.3785
3.6924 6.1564 948 9.3788
3.5211 6.1824 952 9.3789
3.7995 6.2085 956 9.3789
3.6982 6.2345 960 9.3786
3.7288 6.2606 964 9.3779
3.7139 6.2866 968 9.3770
3.564 6.3127 972 9.3764
3.7078 6.3388 976 9.3760
3.6165 6.3648 980 9.3760
3.5605 6.3909 984 9.3758
3.7073 6.4169 988 9.3757
3.5997 6.4430 992 9.3756
3.444 6.4691 996 9.3756
3.6781 6.4951 1000 9.3757
3.6395 6.5212 1004 9.3759
3.5381 6.5472 1008 9.3761
3.7774 6.5733 1012 9.3757
3.6378 6.5993 1016 9.3752
3.6952 6.6254 1020 9.3749
3.6099 6.6515 1024 9.3746
3.7006 6.6775 1028 9.3743
3.7079 6.7036 1032 9.3743
3.7302 6.7296 1036 9.3743
3.6339 6.7557 1040 9.3745
3.5705 6.7818 1044 9.3744
3.6991 6.8078 1048 9.3742
3.7199 6.8339 1052 9.3740
3.7193 6.8599 1056 9.3739
3.5579 6.8860 1060 9.3739
3.7278 6.9121 1064 9.3739
3.5131 6.9381 1068 9.3740

Framework versions

  • PEFT 0.14.0
  • Transformers 4.48.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.1
  • Tokenizers 0.21.0
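
With the PEFT and Transformers versions listed above, the adapter can in principle be loaded on top of the 4-bit base model for inference. The following is an untested sketch: the prompt format and generation settings are assumptions, and a CUDA GPU with bitsandbytes installed is assumed for the pre-quantized base checkpoint.

```python
# Minimal sketch (not from the card) of loading this LoRA adapter on the base model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "unsloth/tinyllama-chat-bnb-4bit"
adapter_id = "sril32996/synthAIze_telugu_colloquial_trans"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)

# Illustrative prompt only; the chat/prompt template used in training is not documented.
prompt = "Translate to colloquial Telugu: How are you?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```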