---
base_model:
- google/gemma-3-4b-pt
pipeline_tag: text-generation
library_name: transformers
---
# **thinkygemma-4b: your average fake reasoner**
Fine-tuned from **Gemma-3-4b-pt**
- **Model ID:** `xsanskarx/thinkygemma-4b`
- **Parameters trained:** **1.8 billion**
- **Trained on:** **25k rows of verified Chain-of-Thought (CoT) traces** from **DeepSeek R1** and **Qwen QWQ**
- **Next planned step:** **GRPO**
- **Adapters repo:** `xsanskarx/thinkgemma-4b` (loading sketch below)
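If you prefer to attach the LoRA adapters to the base model instead of downloading the merged checkpoint, a minimal sketch with `peft` is below. It assumes the adapters in `xsanskarx/thinkgemma-4b` target the base model listed in this card's metadata (`google/gemma-3-4b-pt`) and that the tokenizer can be taken from the merged repo; adjust if your setup differs.

```python
from transformers import AutoTokenizer, Gemma3ForConditionalGeneration
from peft import PeftModel

# Assumed base: the model listed in this card's metadata
base = Gemma3ForConditionalGeneration.from_pretrained(
    "google/gemma-3-4b-pt", device_map="auto"
)

# Attach the LoRA adapters from the adapters repo
model = PeftModel.from_pretrained(base, "xsanskarx/thinkgemma-4b").eval()

# Tokenizer taken from the merged model repo (assumption)
tokenizer = AutoTokenizer.from_pretrained("xsanskarx/thinkygemma-4b")
```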
---
## **Model Description**
This is a **fine-tuned version of Google's Gemma-3-4b-it**, adapted for **structured reasoning / fake induced reasoning**. It is designed to excel at acting like a great reasoner.
### **Training Details**
- **Hardware:** Single NVIDIA **H100**
- **Training Time:** **9 hours (1 epoch)**
- **Training Method:** **LoRA fine-tuning (r = 128, alpha = 256)** (see the configuration sketch after this list)
- **Dataset:** **25k CoT traces**
- **Base Model:** `google/gemma-3-4b-it`
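
For reference, the reported hyperparameters could be expressed with `peft` roughly as follows. This is a sketch, not the actual training config: the card only states r and alpha, so the target modules, dropout, and bias setting here are assumptions.

```python
from peft import LoraConfig

# Hypothetical LoRA configuration matching the reported r=128, alpha=256.
lora_config = LoraConfig(
    r=128,
    lora_alpha=256,
    lora_dropout=0.05,          # assumed; not stated in the card
    bias="none",                # assumed
    task_type="CAUSAL_LM",
    # Assumed attention + MLP projection targets; the card does not list them
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```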
---
### **Setup**
```python
from transformers import AutoTokenizer, Gemma3ForConditionalGeneration, TextStreamer
import torch

# Load model and tokenizer
model_id = "xsanskarx/thinkygemma-4b"
model = Gemma3ForConditionalGeneration.from_pretrained(model_id, device_map="auto").eval()
tokenizer = AutoTokenizer.from_pretrained(model_id)

def ask_model(prompt: str, max_tokens=8192, temperature=0.7):
    """Ask the model a question and stream the response to stdout."""
    messages = [
        {"role": "system", "content": "You are an expert math problem solver, think and reason inside <think> tags, enclose all reasoning in <think> tags, verifying logic step by step and then return your final structured answer"},
        {"role": "user", "content": prompt}
    ]
    # Build the prompt with the chat template and append the generation prompt
    formatted_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
    # Stream decoded tokens as they are generated
    streamer = TextStreamer(tokenizer, skip_special_tokens=True)
    with torch.inference_mode():
        model.generate(**inputs, max_new_tokens=max_tokens, do_sample=True, temperature=temperature, streamer=streamer)

# Example usage
ask_model("do 2+2")
```