cgato
/

Nemo12b-TheSyntheticOne

Model card Files Files and versions Community

Trained using https://huggingface.co/datasets/cgato/TheSmarts for demonstration purposes. Probably a fairly competent assistant model. May do KTO overtop to sand down the edges and improve performance later.

Prompt Format: ChatML

Roles: system, user, assistant

Training results

Training Loss	Epoch	Step	Validation Loss
0.9069	0.0007	1	0.9335
0.8022	0.0656	100	0.7534
0.7916	0.1311	200	0.7327
0.7568	0.1967	300	0.7209
0.8309	0.2623	400	0.7130
0.7026	0.3279	500	0.7058
0.7391	0.3934	600	0.6993
0.7426	0.4590	700	0.6930
0.6546	0.5246	800	0.6878
0.7402	0.5902	900	0.6827
0.6857	0.6557	1000	0.6780
0.6156	0.7213	1100	0.6743
0.6275	0.7869	1200	0.6712
0.6259	0.8525	1300	0.6684
0.6702	0.9180	1400	0.6665
0.6835	0.9836	1500	0.6649

Downloads last month: 6

Safetensors

Model size

12.2B params

Tensor type

BF16

·

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for cgato/Nemo12b-TheSyntheticOne

Quantizations

1 model