|
--- |
|
base_model: |
|
- google/gemma-3-4b-pt |
|
pipeline_tag: text-generation |
|
library_name: transformers |
|
--- |
|
# **thinkygemma-4b: your average fake reasoner** |
|
Fine-tuned from **Gemma-3-4b-pt** |
|
|
|
- **Model ID:** `xsanskarx/thinkygemma-4b`
- **Parameters trained:** **1.8 billion**
- **Trained on:** **25k rows of verified Chain-of-Thought (CoT) traces** from **DeepSeek R1** and **Qwen QWQ**
- **Next planned step:** **GRPO**
- **Adapters repo:** `xsanskarx/thinkgemma-4b` (see the loading sketch below)
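
If you want to experiment with the adapters directly rather than the merged model, a minimal loading sketch is shown below. It assumes the adapters repo is in standard PEFT format and applies to the base checkpoint listed above; verify the exact base model id before use.

```python
# Minimal sketch (not an official snippet): load the LoRA adapters on top of
# the base model with PEFT. Assumes the adapters repo is in standard PEFT format.
import torch
from transformers import AutoTokenizer, Gemma3ForConditionalGeneration
from peft import PeftModel

base = Gemma3ForConditionalGeneration.from_pretrained(
    "google/gemma-3-4b-pt", torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, "xsanskarx/thinkgemma-4b").eval()
tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-4b-pt")
```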
|
|
|
|
|
--- |
|
|
|
## **Model Description** |
|
This is a **fine-tuned version of Google's Gemma-3-4b-it**, adapted for **structured reasoning / induced "fake" reasoning**. It is designed to excel at acting like a great reasoner.
|
|
|
### **Training Details** |
|
- **Hardware:** Single NVIDIA **H100** |
|
- **Training Time:** **9 hours (1 epoch)** |
|
- **Training Method:** **LoRA fine-tuning (r = 128, alpha = 256)** (a configuration sketch follows this list)
|
- **Dataset:** **25k CoT traces** |
|
- **Base Model:** `google/gemma-3-4b-it` |
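
For orientation only, the stated LoRA setup corresponds roughly to a PEFT configuration like the one below. Only `r` and `lora_alpha` come from this card; the dropout, bias setting, task type, and target modules are assumptions, not the actual training script.

```python
# Hypothetical reconstruction of the LoRA setup (r = 128, alpha = 256) using PEFT.
# Everything except r and lora_alpha is an assumption.
from peft import LoraConfig

lora_config = LoraConfig(
    r=128,                  # rank reported in the card
    lora_alpha=256,         # alpha reported in the card
    lora_dropout=0.05,      # assumption
    bias="none",            # assumption
    task_type="CAUSAL_LM",  # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
)
```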
|
|
|
--- |
|
|
|
|
|
|
|
### **Setup** |
|
```python
from transformers import AutoTokenizer, Gemma3ForConditionalGeneration, TextStreamer
import torch

# Load model and tokenizer
model_id = "xsanskarx/thinkygemma-4b"
model = Gemma3ForConditionalGeneration.from_pretrained(model_id, device_map="auto").eval()
tokenizer = AutoTokenizer.from_pretrained(model_id)


def ask_model(prompt: str, max_tokens=8192, temperature=0.7):
    """Ask the model a question and stream the response."""
    messages = [
        {
            "role": "system",
            "content": (
                "You are an expert math problem solver, think and reason inside <think> tags, "
                "enclose all reasoning in <think> tags, verifying logic step by step "
                "and then return your final structured answer"
            ),
        },
        {"role": "user", "content": prompt},
    ]

    # Apply the chat template and append the generation prompt so the model
    # starts a new assistant turn.
    formatted_prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)

    # Stream tokens to stdout as they are generated.
    streamer = TextStreamer(tokenizer, skip_special_tokens=True)
    with torch.inference_mode():
        model.generate(**inputs, max_new_tokens=max_tokens, do_sample=True, temperature=temperature, streamer=streamer)


# Example usage
ask_model("do 2+2")
```
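
Since the model is prompted to put its reasoning inside `<think>` tags, you may prefer the reasoning and the final answer as separate strings instead of a stream. The helper below is a hedged sketch built on the same `model` and `tokenizer` as above; `ask_model_split` is a hypothetical name, and it assumes the model reliably emits a closing `</think>` tag.

```python
# Hypothetical helper (not part of the original card): generate without
# streaming and split the <think> reasoning from the final answer.
def ask_model_split(prompt: str, max_tokens=8192, temperature=0.7):
    # Reuse the same system message as in ask_model for best results.
    messages = [{"role": "user", "content": prompt}]
    formatted_prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
    with torch.inference_mode():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_tokens, do_sample=True, temperature=temperature
        )
    # Decode only the newly generated tokens.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    text = tokenizer.decode(new_tokens, skip_special_tokens=True)
    reasoning, _, answer = text.partition("</think>")
    return reasoning.replace("<think>", "").strip(), answer.strip()


reasoning, answer = ask_model_split("do 2+2")
print(answer)
```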