Files changed (1)
  1. README.md +69 -58
README.md CHANGED
@@ -1,59 +1,70 @@
- ---
- license: apache-2.0
- license_link: https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct/blob/main/LICENSE
- language:
- - tr
- - en
- datasets:
- - erayalp/easy_turkish_math_reasoning
- base_model:
- - Qwen/Qwen2.5-0.5B-Instruct
- pipeline_tag: text-generation
- library_name: transformers
- tags:
- - curriculum-learning
- - math
- - supervised-fine-tuning
- - turkish
- ---
-
- ## Objective
- The goal of this project is to enhance the reasoning ability of the compact Qwen2.5-0.5B model on Turkish math questions. Starting from supervised fine-tuning (SFT) on simpler examples, the model is progressively improved through curriculum learning and later refined with Group Relative Policy Optimization (GRPO) to boost multi-step reasoning performance.
-
- ### This model is intended for:
- - Research on curriculum learning in small models
- - Evaluation on Turkish math reasoning tasks
-
- ### Limitations
- - Currently trained only on simpler math examples, so it lacks robustness for multi-step or abstract reasoning.
- - May produce incorrect or overconfident answers on complex tasks.
- - Performance may be sensitive to prompt phrasing.
-
- ### Roadmap
- 1. **Phase 1: SFT with basic arithmetic and math problems**
- 2. Phase 2: SFT with moderately difficult math problems
- 3. Phase 3: SFT with full-scale GSM8K-TR complexity
- 4. Phase 4: GRPO-based training to optimize multi-step reasoning and reduce hallucinations
-
- ## How to Use
-
- You can run inference with the Transformers library:
-
- ```python
- from transformers import AutoTokenizer, AutoModelForCausalLM
-
- model_name = "erayalp/qwen2.5-0.5b-instruct-sft-v1-tr-math-easy"
- tokenizer = AutoTokenizer.from_pretrained(model_name)
- model = AutoModelForCausalLM.from_pretrained(
-     model_name,
-     torch_dtype="auto",
-     device_map="auto"
- )
-
- # "Ali had 3 pencils. He bought 2 more. How many pencils does Ali have now?"
- prompt = "Ali’nin 3 kalemi vardı. 2 kalem daha aldı. Ali’nin şimdi kaç kalemi var?"
-
- # Keep the inputs on the same device as the model (device_map="auto" may place it on a GPU).
- inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
- output = model.generate(**inputs, max_new_tokens=256)
-
  print(tokenizer.decode(output[0], skip_special_tokens=True))
 
+ ---
+ license: apache-2.0
+ license_link: https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct/blob/main/LICENSE
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ datasets:
+ - erayalp/easy_turkish_math_reasoning
+ base_model:
+ - Qwen/Qwen2.5-0.5B-Instruct
+ pipeline_tag: text-generation
+ library_name: transformers
+ tags:
+ - curriculum-learning
+ - math
+ - supervised-fine-tuning
+ - turkish
+ ---
+
+ ## Objective
+ The goal of this project is to enhance the reasoning ability of the compact Qwen2.5-0.5B model on Turkish math questions. Starting from supervised fine-tuning (SFT) on simpler examples, the model is progressively improved through curriculum learning and later refined with Group Relative Policy Optimization (GRPO) to boost multi-step reasoning performance.
+
+ ### This model is intended for:
+ - Research on curriculum learning in small models
+ - Evaluation on Turkish math reasoning tasks
+
+ ### Limitations
+ - Currently trained only on simpler math examples, so it lacks robustness for multi-step or abstract reasoning.
+ - May produce incorrect or overconfident answers on complex tasks.
+ - Performance may be sensitive to prompt phrasing.
+
+ ### Roadmap
+ 1. **Phase 1: SFT with basic arithmetic and math problems**
+ 2. Phase 2: SFT with moderately difficult math problems
+ 3. Phase 3: SFT with full-scale GSM8K-TR complexity
+ 4. Phase 4: GRPO-based training to optimize multi-step reasoning and reduce hallucinations
+
+ ## How to Use
+
+ You can run inference with the Transformers library:
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ model_name = "erayalp/qwen2.5-0.5b-instruct-sft-v1-tr-math-easy"
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype="auto",
+     device_map="auto"
+ )
+
+ # "Ali had 3 pencils. He bought 2 more. How many pencils does Ali have now?"
+ prompt = "Ali’nin 3 kalemi vardı. 2 kalem daha aldı. Ali’nin şimdi kaç kalemi var?"
+
+ # Keep the inputs on the same device as the model (device_map="auto" may place it on a GPU).
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ output = model.generate(**inputs, max_new_tokens=256)
+
  print(tokenizer.decode(output[0], skip_special_tokens=True))
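
Since the base model is Qwen2.5-0.5B-Instruct, prompts are normally wrapped in the tokenizer's chat template rather than passed as raw text. The card does not say whether this fine-tune expects the template, so the snippet below is only a minimal sketch of the usual Qwen chat-style inference; adjust it if the model was trained on raw question strings.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "erayalp/qwen2.5-0.5b-instruct-sft-v1-tr-math-easy"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

# Wrap the question in the chat format the Instruct base model was trained with.
messages = [
    {"role": "user", "content": "Ali’nin 3 kalemi vardı. 2 kalem daha aldı. Ali’nin şimdi kaç kalemi var?"}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(text, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, not the prompt.
generated = output[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))
```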
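For Phase 1 of the roadmap, the card metadata lists erayalp/easy_turkish_math_reasoning as the SFT data. The sketch below shows one way such examples could be rendered into chat-formatted training text (for example for an SFT trainer that consumes a "text" column); the train split and the "question"/"answer" column names are assumptions here, not taken from the dataset card, so check the actual schema first.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Dataset listed in the card metadata; the split and the "question"/"answer"
# column names are assumptions, check the dataset card for the real schema.
ds = load_dataset("erayalp/easy_turkish_math_reasoning", split="train")

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")

def to_chat_text(example):
    # Render one question/answer pair with the Qwen chat template so the
    # training text matches the chat-style prompting used at inference time.
    messages = [
        {"role": "user", "content": example["question"]},
        {"role": "assistant", "content": example["answer"]},
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

sft_ds = ds.map(to_chat_text)
print(sft_ds[0]["text"])
```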