Llamafied version of Qwen 0.5B, further fine-tuned on wiki, math, science, and chat datasets. Based on Cinder data. For best performance, this model should be further fine-tuned on RAG, function-calling, programming, or assistant datasets. The next model will focus on RAG.

This model is okay at RAG. It is very verbose because it was trained on Wikipedia Q&A where a whole article serves as the answer, plus Tiny-Textbooks and Cosmopedia 100k, all of which have very long responses. It was also trained on standard RAG datasets, a medical RAG dataset I put together, and most of the common math chat datasets. Conversation datasets include Hermes 1, FastChat, Synthia, Capybara, Cinder, Puffin, etc. I will work on putting together the full list and posting it.
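Since the model targets RAG use, here is a minimal sketch of how retrieved passages might be packed into a prompt for it. The template wording and the helper name are assumptions for illustration; this card does not specify the exact prompt format the model was trained with.

```python
def build_rag_prompt(passages, question):
    # Hypothetical helper: join retrieved passages into a context
    # block, then append the user question. The exact template the
    # model expects is an assumption, not confirmed by this card.
    context = "\n\n".join(passages)
    return (
        "Use the following context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

prompt = build_rag_prompt(
    ["Qwen is a family of open language models."],
    "What is Qwen?",
)
print(prompt)
```

Given the model's verbosity noted above, constraining the answer format in the prompt (e.g. "Answer in one sentence.") may help keep responses short.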

Format: Safetensors
Model size: 464M params
Tensor type: F16