Cascade0

My first ever LLM, trained locally on a single RTX 4080 in 1.5-2 weeks. Although it is small (159M parameters) and cannot answer direct questions ("What's the capital of France?"), it can absolutely complete sentences coherently and correctly. One thing to note: it currently outputs everything in lowercase (due to a training bug).
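
If you want to try it, here's a minimal sketch using the standard transformers API (the prompt and sampling settings are just an example, not the exact config I trained or tested with):

```python
# Minimal sketch: load Cascade0 and complete a sentence with the
# standard transformers API. Sampling settings here are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "ARMZyany/Cascade0-159M-Instruct-45k"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo)

# lowercase prompt, since the model currently outputs everything in lowercase
prompt = "the capital of france is"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=40, do_sample=True,
                        temperature=0.8, top_p=0.95)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```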

GPT2 vs Cascade0

Both models are similar in size (161M for GPT2, 159M for Cascade0), at the same F16 quant. Example responses from each are in the screenshots; both models hallucinate after the second turn in a single chat.

[screenshots: example responses from GPT2 and Cascade0, plus the second-turn hallucinations]
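
The side-by-side used the same prompt and decoding settings for both models. A rough sketch of that kind of comparison (the prompt and settings below are placeholders, not the exact ones behind the screenshots):

```python
# Rough sketch of a GPT2 vs Cascade0 side-by-side: identical prompt and
# decoding settings for both models. Prompt and settings are placeholders,
# not the exact ones used for the screenshots.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32  # F16 on GPU

prompt = "the sun was setting over the hills when"

for repo in ("gpt2", "ARMZyany/Cascade0-159M-Instruct-45k"):
    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=dtype).to(device)
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    out = model.generate(**inputs, max_new_tokens=40, do_sample=True, temperature=0.8)
    print(f"--- {repo} ---")
    print(tokenizer.decode(out[0], skip_special_tokens=True))
```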

I had Gemini 2.5 Flash analyze the responses; its verdict was: [screenshot of the verdict]

This project started in May 2025. The training code is AI generated, BUT it took a lot of human effort (mostly debugging and prompt engineering) to reach this state: lots of trial and error, switching between AIs (GPT, Gemini, DeepSeek), electricity and time wasted on failed training runs... and lots of frustration. It was only recently, after I bought ChatGPT Plus, that I could pull this off, having almost abandoned everything. But after all, this is my dream, and I just feel good when I see this on my page. <3
