TRI-ML/DCLM-1B



Is this model supported for fine-tuning with flash attention?

#4 opened about 2 months ago by

MMLU Performance After Token Training

#3 opened 12 months ago by