---
base_model:
- google/gemma-3-4b-pt
pipeline_tag: text-generation
library_name: transformers
---

# **thinkygemma-4b: your average fake reasoner**

Fine-tuned from **Gemma-3-4b-pt**

📌 **Model ID:** `xsanskarx/thinkygemma-4b`
📌 **Parameters trained:** **1.8 billion**
📌 **Trained on:** **25k rows of verified Chain-of-Thought (CoT) traces** from **DeepSeek R1** and **Qwen QwQ**
📌 **Next planned step:** **GRPO**
📌 **Adapters repo:** `xsanskarx/thinkgemma-4b`

---

## **Model Description**

This is a **fine-tuned version of Google's Gemma-3-4b-it**, adapted for **structured reasoning / fake induced reasoning**. It is designed to excel at acting like a great reasoner.

### **Training Details**

- **Hardware:** Single NVIDIA **H100**
- **Training Time:** **9 hours (1 epoch)**
- **Training Method:** **LoRA fine-tuning (r = 128, alpha = 256)**
- **Dataset:** **25k CoT traces**
- **Base Model:** `google/gemma-3-4b-it`

---

### **Setup**

```python
from transformers import AutoTokenizer, Gemma3ForConditionalGeneration, TextStreamer
import torch

# Load the fine-tuned model and its tokenizer
model_id = "xsanskarx/thinkygemma-4b"
model = Gemma3ForConditionalGeneration.from_pretrained(model_id, device_map="auto").eval()
tokenizer = AutoTokenizer.from_pretrained(model_id)

def ask_model(prompt: str, max_tokens=8192, temperature=0.7):
    """Ask the model a question and stream the response to stdout."""
    messages = [
        {"role": "system", "content": "You are an expert math problem solver, think and reason inside tags, enclose all reasoning in tags, verifying logic step by step and then return your final structured answer"},
        {"role": "user", "content": prompt},
    ]
    # Apply the chat template; add_generation_prompt=True appends the assistant
    # turn marker so the model starts answering instead of continuing the prompt.
    formatted_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
    streamer = TextStreamer(tokenizer, skip_special_tokens=True)
    with torch.inference_mode():
        model.generate(
            **inputs,
            max_new_tokens=max_tokens,
            do_sample=True,
            temperature=temperature,
            streamer=streamer,
        )

# Example usage
ask_model("do 2+2")
```
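---

### **Loading the Adapters (sketch)**

If you would rather attach the LoRA adapters from the adapters repo above to the base model yourself, a minimal sketch with PEFT follows. It assumes `xsanskarx/thinkgemma-4b` is a standard PEFT adapter checkpoint; adjust the base model ID if you start from the `-pt` variant instead.

```python
# Sketch: attach the LoRA adapters to the base Gemma-3 model via PEFT.
# Assumption: the adapters repo is a standard PEFT adapter checkpoint.
from transformers import AutoTokenizer, Gemma3ForConditionalGeneration
from peft import PeftModel

base = Gemma3ForConditionalGeneration.from_pretrained(
    "google/gemma-3-4b-it", device_map="auto"
)
model = PeftModel.from_pretrained(base, "xsanskarx/thinkgemma-4b").eval()
tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-4b-it")
```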
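---

### **Reproducing the LoRA Config (sketch)**

For reference, here is a minimal sketch of the LoRA setup listed under Training Details (r = 128, alpha = 256), using PEFT. The target modules and dtype are assumptions, since the card does not specify them.

```python
# Sketch of the LoRA configuration from Training Details.
# target_modules and torch_dtype are assumptions, not taken from the card.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-3-4b-it", torch_dtype=torch.bfloat16
)
lora_config = LoraConfig(
    r=128,
    lora_alpha=256,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # prints the trainable-parameter count
```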