AI & ML interests
LLM training in simple, pure C/CUDA
Fun experiments with llm.c
- yuchenj/gpt2_124M_100B_FinewebEdu_hf
  Text Generation • 0.1B • Updated • 28
- yuchenj/gpt2_350M_100B_FinewebEdu_hf
  Text Generation • 0.4B • Updated • 14
- yuchenj/gpt2_774M_100B_FinewebEdu_hf
  Text Generation • 0.8B • Updated • 17 • 1
- yuchenj/gpt2_1558M_100B_FinewebEdu_hf
  Text Generation • 2B • Updated • 11 • 1
Searching for a successful training recipe that interleaves pre-training data with instruction data