kenhktsui committed on
Commit
c278b08
·
verified ·
1 Parent(s): 6b4eab2

Update README.md

  1. README.md +145 -1
README.md CHANGED
@@ -9,7 +9,151 @@ tags:
9
  license: apache-2.0
10
  language:
11
  - en
12
  ---
 
13
 
14
  # Uploaded model
15
 
@@ -19,4 +163,4 @@ language:
19
 
20
  This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
21
 
22
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
9
  license: apache-2.0
10
  language:
11
  - en
12
+ datasets:
13
+ - kenhktsui/longtalk-cot-v0.1
14
  ---
15
+ # kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf
16
+
17
+ After 1 epoch of SFT, the model improves on the ProcessBench GSM-8k subset by 26% relative to the base model (36.5 to 46.0). However, it scores lower on GSM-8k itself (78.54 vs. 84.23).
18
+ This seems to be either 1) a measurement issue, where the metrics are off because answers are not parsed correctly, or 2) a sign of overfitting to final answers, as the model is not able to critique reasoning steps.
19
+
20
+ |Benchmark|[kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged](https://huggingface.co/kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged)|[Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)|
21
+ |---|---|---|
22
+ |ProcessBench (GSM-8k)|**46.0**|36.5 (reported)|
23
+ |GSM-8k|78.54|**84.23**|
24
+
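As a quick check of the numbers in the table, the relative changes can be recomputed from the reported scores. This is only a sketch: the helper name `relative_change` is introduced here for illustration and is not part of the original card.

```python
def relative_change(new: float, baseline: float) -> float:
    """Relative change of `new` versus `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# Scores copied from the table above (fine-tuned model vs. Qwen2.5-7B-Instruct).
processbench = relative_change(46.0, 36.5)
gsm8k = relative_change(78.54, 84.23)

print(f"ProcessBench (GSM-8k): {processbench:+.1f}%")  # +26.0%
print(f"GSM-8k: {gsm8k:+.1f}%")                        # -6.8%
```

The 26% figure is therefore a relative gain (9.5 points absolute), while the GSM-8k drop is about 5.7 points absolute.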
25
+ This needs to be investigated further.
26
+ I inspected how **model behavior changed** after SFT. As expected, the fine-tuned model thinks much longer and often solves a problem with multiple approaches instead of one.
27
+
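The parsing hypothesis mentioned earlier can be illustrated with a small sketch. It assumes, hypothetically, that an evaluation script takes the last number in a response as the answer; the card does not say how answers were actually extracted, and the function names and regular expressions below are illustrative only. A longer "thinking" response gives such a parser more chances to pick up the wrong number, whereas preferring the `\boxed{...}` answer (the format the fine-tuned model uses in the quick tests) avoids that.

```python
import re
from typing import Optional

# Hypothetical patterns, not taken from any specific evaluation harness.
BOXED = re.compile(r"\\boxed\{([^{}]*)\}")
NUMBER = re.compile(r"-?\d+(?:,\d{3})*(?:\.\d+)?")


def last_number(response: str) -> Optional[str]:
    """Naive extraction: the last number that appears anywhere in the text."""
    numbers = NUMBER.findall(response)
    return numbers[-1].replace(",", "") if numbers else None


def extract_answer(response: str) -> Optional[str]:
    """Prefer the last \\boxed{...} answer; fall back to the last number."""
    boxed = BOXED.findall(response)
    if boxed:
        return boxed[-1].strip()
    return last_number(response)


long_response = (
    "So the total is 18 dollars. I verified this with 2 approaches.\n"
    "**Final Answer**\n\\[ \\boxed{18} \\]\nBoth of the 2 checks agree."
)
print(last_number(long_response))     # 2  (misread by the naive parser)
print(extract_answer(long_response))  # 18
```

If the GSM-8k drop shrinks when answers are extracted this way, that would point to explanation 1 rather than overfitting.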
28
+ ### Quick Test 1: Is 9.11 greater than 9.9?
29
+ <details>
30
+ <summary> response from kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf </summary>
31
+ No, 9.11 is not greater than 9.9. Let's think about this step by step.
32
+
33
+ First, it helps to understand that both numbers are decimal numbers, specifically they have two digits after the decimal point. So, 9.11 and 9.9 can be compared directly based on their
34
+ values in each place value: whole number part and the tenths and hundredths parts.
35
+
36
+ Let's break them down:
37
+
38
+ - For 9.11:
39
+ - The whole number part is 9.
40
+ - The tenths place is 1 (which means one tenth).
41
+ - The hundredths place is also 1 (one hundredth).
42
+
43
+ - For 9.9:
44
+ - The whole number part is also 9.
45
+ - But here, the tenths place is 9 (nine tenths), and there's no hundredths digit after that; it's considered to have zero in the hundredths place because decimal numbers with fewer digits
46
+ are filled with zeros.
47
+
48
+ So, comparing them:
49
+
50
+ - Both have the same whole number part: 9.
51
+ - For the tenth place:
52
+ - 1 vs. 9.
53
+
54
+ Since 9 is greater than 1, this already tells us that 9.9 is larger than 9.11. But to be thorough, let's look at the hundredths places as well:
55
+
56
+ - In 9.11: hundredths = 1
57
+ - In 9.9: hundredths = 0
58
+
59
+ Even though there are no digits in the hundredths place for 9.9, it's still considered zero because of the way decimals work.
60
+
61
+ Therefore, overall, 9.11 < 9.9.
62
+
63
+ Wait a minute, but I should make sure that I'm not missing anything here. Sometimes, people might think about adding or subtracting numbers differently, but in this case, since both numbers
64
+ have the same whole number part and one has a larger tenths digit, it's clear which is larger.
65
+
66
+ Another way to look at it is to consider them as percentages:
67
+
68
+ - 9.11 = 911%
69
+ - 9.9 = 990%
70
+
71
+ But that might be confusing; in percentage terms, 990% is still greater than 911%.
72
+
73
+ Alternatively, if we multiply both numbers by 100 to eliminate the decimal point:
74
+
75
+ - 9.11 * 100 = 911
76
+ - 9.9 * 100 = 990
77
+
78
+ Again, 990 is greater than 911.
79
+
80
+ So, no matter how we look at it, 9.9 is larger than 9.11.
81
+
82
+ I think that covers the comparison thoroughly. There doesn't seem to be any trick or special case here; it's a straightforward decimal comparison once you understand place values.
83
+
84
+ **Final Answer**
85
+
86
+ \[ \boxed{9.11 < 9.9} \]
87
+ </details>
88
+
89
+ <details>
90
+ <summary> response from Qwen/Qwen2.5-7B-Instruct-GGUF </summary>
91
+ No, 9.11 is not greater than 9.9. When comparing decimal numbers, you can compare them digit by digit from left to right. In this case:
92
+
93
+ - The whole number part of both numbers is the same (9).
94
+ - Comparing the tenths place, we see that 1 in 9.11 is larger than 9 in 9.9.
95
+ - Therefore, 9.11 > 9.9.
96
+
97
+ So, 9.11 is greater than 9.9.
98
+ </details>
99
+
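For reference, the expected answer to Quick Test 1 can be checked mechanically. The sketch below uses Python's `decimal` module; the `as_version` helper is hypothetical and only shows one plausible way to arrive at the wrong answer (reading the fractional parts as the integers 11 and 9). It is not a diagnosis of what the base model actually did.

```python
from decimal import Decimal

a, b = Decimal("9.11"), Decimal("9.9")
print(a > b)  # False: 9.11 is not greater than 9.9
print(a < b)  # True


def as_version(text: str) -> tuple:
    """Split on '.' and compare parts as integers, like a version number."""
    return tuple(int(part) for part in text.split("."))


# Treating the numbers like version strings flips the result.
print(as_version("9.11") > as_version("9.9"))  # True, since 11 > 9
```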
100
+
101
+ ### Quick Test 2: How many letters R in the word strawberry?
102
+ <details>
103
+ <summary> response from kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf </summary>
104
+ Let's see. The word is "strawberry." I need to count how many 'R' letters are there in this word.
105
+
106
+ First, let me write down the word again: s-t-r-a-w-b-e-r-r-y.
107
+
108
+ Now, I'll go through each letter and check if it's an 'R'.
109
+
110
+ Starting with the first letter: S - that's not R.
111
+
112
+ Second letter: T - also not R.
113
+
114
+ Third letter: R - yes, that's an 'R'. So, one so far.
115
+
116
+ Fourth letter: A - nope.
117
+
118
+ Fifth letter: W - still not R.
119
+
120
+ Sixth letter: B - not R either.
121
+
122
+ Seventh letter: E - no.
123
+
124
+ Eighth letter: R - another 'R'! That makes two now.
125
+
126
+ Ninth letter: B - no R here.
127
+
128
+ Tenth letter: E - no R.
129
+
130
+ Eleventh letter: R - yet again, an 'R'. Now it's three.
131
+
132
+ Twelfth letter: Y - not R.
133
+
134
+ So, counting all the way through, I've found three 'R' letters in "strawberry."
135
+
136
+ Wait a minute, let me double-check. Maybe there are more 'R's that I missed.
137
+
138
+ Looking at "s-t-r-a-w-b-e-r-r-y." Yes, I did find three: 3rd, 8th, and 11th positions.
139
+
140
+ I think that's it. There don't seem to be any other 'R's in the word.
141
+
142
+ Therefore, there are 3 letters R in the word "strawberry."
143
+
144
+ **Final Answer**
145
+
146
+ \[ \boxed{3} \]
147
+ </details>
148
+ <details>
149
+ <summary> response from Qwen/Qwen2.5-7B-Instruct-GGUF </summary>
150
+ The letter "R" appears 1 time in the word "strawberry".
151
+ </details>
152
+
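The expected answer to Quick Test 2 can be checked the same way. This short sketch counts the letter and lists its 1-based positions.

```python
word = "strawberry"

# 1-based positions of every 'r' in the word.
positions = [i for i, ch in enumerate(word, start=1) if ch == "r"]

print(len(word))       # 10
print(positions)       # [3, 8, 9]
print(len(positions))  # 3
```

The fine-tuned model's final count of 3 is correct. Note, however, that its walk-through enumerates twelve letters and places the R's at positions 3, 8, and 11, whereas the word has ten letters with R's at positions 3, 8, and 9.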
153
+ ### Try it out in Ollama!
154
+ ```
155
+ ollama run hf.co/kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf
156
+ ```
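Once the model has been pulled with the command above, it can also be queried programmatically. The following is a minimal sketch that assumes a local Ollama server on its default port (11434) and uses the `/api/generate` endpoint with streaming disabled; the helper names `build_request` and `ask` are introduced here for illustration, and the model name is assumed to match the one used in the `ollama run` command.

```python
import json
import urllib.request

# Assumptions: default local Ollama address, model already pulled via `ollama run`.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "hf.co/kenhktsui/qwen2.5-7b-instruct-thinking-sft-merged-gguf"


def build_request(prompt: str, model: str = MODEL, url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming generate request for the Ollama REST API."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


# Example (requires a running Ollama server):
# print(ask("Is 9.11 greater than 9.9?"))
```

Because the fine-tuned model produces long reasoning traces, expect responses to take noticeably longer than with the base model.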
157
 
158
  # Uploaded model
159
 
 
163
 
164
  This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
165
 
166
+ [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)