---
library_name: transformers
tags:
- text-generation-inference
- code
- math
- R1
- distill
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen2.5-1.5B-Instruct
pipeline_tag: text-generation
---
![PPP.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/7xnCRDtZ0T3-oWnCrw-Rf.png)

# **Castula-U2-QwenRe-1.5B**

> **Castula-U2-QwenRe-1.5B** is a **compact, multilingual reasoning model** fine-tuned from **Qwen2.5-1.5B-Instruct**, excelling in **mathematical problem solving**, **logical reasoning**, **code generation**, and **general-purpose tasks**. Its step-by-step reasoning and bilingual fluency make it ideal for educational systems, coding assistants, and lightweight reasoning applications.

## **Key Features**

1. **Advanced Step-by-Step Reasoning**  
   Fine-tuned to produce intermediate steps for math, logic, and code problems, offering transparency and interpretability crucial for education, coding help, and diagnostics.

2. **Multilingual Proficiency (English + Chinese)**  
   Understands and solves problems in **both English and Simplified Chinese**, making it accessible in diverse learning and working environments.

3. **Compact Yet Versatile (1.5B Parameters)**  
   Small enough for **low-resource environments**, yet capable of **math**, **logical puzzles**, **basic coding tasks**, and general comprehension, balancing performance and efficiency.

4. **Structured Computation & Problem Solving**  
   Mirrors human-like multi-step problem-solving, making solutions easy to follow, debug, or verify.

## **Quickstart with Transformers**

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Castula-U2-QwenRe-1.5B"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Solve: A train travels 180 km in 3 hours. What is its average speed?"
messages = [
    {"role": "system", "content": "You are a helpful tutor skilled in solving math, logic, and code problems with step-by-step explanations."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Drop the prompt tokens so only the newly generated text is decoded.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
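For reference, the sample prompt has a single correct answer that can be checked directly; this quick sanity check (independent of the model) is useful when verifying the model's step-by-step output:

```python
# Average speed = distance / time, for the quickstart's sample problem.
distance_km = 180
time_h = 3
average_speed = distance_km / time_h  # 60.0 km/h
print(f"{average_speed:.0f} km/h")  # → 60 km/h
```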

## **Intended Use**

- **Math & Logic Tutoring**: Solves problems with explanations ideal for students and educators.
- **Code Assistant**: Helps with beginner-to-intermediate code generation and understanding.
- **Bilingual Apps**: Educational tools in **English** and **Chinese** for a global audience.
- **Lightweight Reasoning Systems**: Deployable in **mobile apps**, **browser extensions**, and **edge devices**.
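For the low-resource deployments mentioned above, one option (not covered in the quickstart; it assumes `bitsandbytes` is installed alongside `transformers`) is to load the model with 4-bit quantization via `BitsAndBytesConfig`, which roughly quarters the memory footprint at a small cost in output quality:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "prithivMLmods/Castula-U2-QwenRe-1.5B"

# NF4 4-bit weights with bfloat16 compute is a common memory/quality trade-off.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```

Generation then proceeds exactly as in the quickstart; only the loading step changes.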

## **Limitations**

1. **Domain Specialization**:  
   Performs best on math, logic, and code; quality may degrade on highly creative or abstract language tasks.

2. **Compact Scale**:  
   While efficient, it may underperform larger models on deeply complex reasoning or long-context tasks.

3. **Inherited Bias**:  
   May reflect biases from the base model (Qwen2.5-1.5B-Instruct); outputs should be verified for sensitive or critical uses.

4. **Prompt Sensitivity**:  
   Structured, clearly stated inputs produce significantly better outputs.
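As a hypothetical illustration of that last point, the same question can be posed loosely or with explicit structure; the structured form tends to elicit cleaner step-by-step answers. The message dicts below use the same chat format as the quickstart:

```python
# A vague prompt leaves the task and desired format underspecified:
vague = [{"role": "user", "content": "train 180km 3h speed?"}]

# A structured prompt states the task, the givens, and the expected format:
structured = [
    {"role": "system", "content": "You are a math tutor. Show each step."},
    {
        "role": "user",
        "content": (
            "A train travels 180 km in 3 hours.\n"
            "Question: What is its average speed in km/h?\n"
            "Answer step by step, then state the final answer on its own line."
        ),
    },
]

# Either list can be passed to tokenizer.apply_chat_template(...) as in the quickstart.
print(structured[1]["content"])
```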