Luigi / SmolLM2-360M-Instruct-TaiwanChat

SmolLM2-360M-Instruct-TaiwanChat / train_with_unsloth.py

Commit History

adjust hyper-parameters

976c215

Luigi committed on May 1

train on 800k examples

38e2b45

Luigi committed on May 1

train with whole dataset

39559f2

Luigi committed on May 1

also show examples leading to infinite eval loss

9c94812

Luigi committed on May 1

update train script

fc65dac

Luigi committed on Apr 30

update train script

4bf72b9

Luigi committed on Apr 29

update train script

c285ad3

Luigi committed on Apr 28

do not add generation prompt at the end; irrelevant for training and evaluation

36395a3

Luigi committed on Apr 28

decrease val size

d874b94

Luigi committed on Apr 28

re-implement dataset filtering in a more efficient way

afa5f94

Luigi committed on Apr 28

bugfix

69d7616

Luigi committed on Apr 28

filter out samples too long for MAX_LEN from dataset

144d876

Luigi committed on Apr 28

eval loss became NaN while train loss stayed finite; adjust hyper-parameters

aa49cd2

Luigi committed on Apr 28

adjust training parameters to prevent underfitting

7d54ecf

Luigi committed on Apr 27

initial commit

a5af3c2

Luigi committed on Apr 27