Cascade0

My first ever LLM, trained locally on a single RTX 4080 in 1.5-2 weeks. Although it is small (159M parameters) and cannot answer direct questions ("What's the capital of France?"), it can absolutely complete sentences coherently and correctly. One thing to note: it currently outputs everything in lowercase (due to a training bug).
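
If you want to try it, here's a minimal sketch using the standard transformers API (the prompt and sampling settings are just an example, not the exact config I trained or tested with):

```python
# Minimal sketch: load Cascade0 and complete a sentence with the
# standard transformers API. Sampling settings here are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "ARMZyany/Cascade0-159M-Instruct-45k"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo)

# lowercase prompt, since the model currently outputs everything in lowercase
prompt = "the capital of france is"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=40, do_sample=True,
                        temperature=0.8, top_p=0.95)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```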

GPT2 vs Cascade0

Both models are similar in size (161M for GPT2, 159M for Cascade0), at the same F16 quant. Example responses from each are in the screenshots; both models hallucinate after the second turn in a single chat.

[screenshots: example responses from GPT2 and Cascade0, plus the second-turn hallucinations]
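
The side-by-side used the same prompt and decoding settings for both models. A rough sketch of that kind of comparison (the prompt and settings below are placeholders, not the exact ones behind the screenshots):

```python
# Rough sketch of a GPT2 vs Cascade0 side-by-side: identical prompt and
# decoding settings for both models. Prompt and settings are placeholders,
# not the exact ones used for the screenshots.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32  # F16 on GPU

prompt = "the sun was setting over the hills when"

for repo in ("gpt2", "ARMZyany/Cascade0-159M-Instruct-45k"):
    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=dtype).to(device)
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    out = model.generate(**inputs, max_new_tokens=40, do_sample=True, temperature=0.8)
    print(f"--- {repo} ---")
    print(tokenizer.decode(out[0], skip_special_tokens=True))
```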

I had Gemini 2.5 Flash analyze the responses; its verdict was: [screenshot of the verdict]

This project started in May 2025. The training code is AI generated, BUT it took a lot of human effort (mostly debugging and prompt engineering) to reach this state: lots of trial and error, switching between AIs (GPT, Gemini, DeepSeek), electricity and time wasted on failed training runs... and lots of frustration. It was only recently, after I bought ChatGPT Plus, that I could pull this off, having almost abandoned everything. But after all, this is my dream, and I just feel good when I see this on my page. <3
