justinj92/Qwen2.5-1.5B-Thinking-v1.1

Text Generation

Generated from Trainer

text-generation-inference


justinj92 committed on Feb 4

Commit eb34d79 · verified · 1 Parent(s): 2db1f37

Update README.md

Files changed (1)

README.md +15 -16

@@ -9,18 +9,18 @@ tags:
 licence: license
 datasets:
 - microsoft/orca-math-word-problems-200k
-# model-index:
-#   - name: Qwen2.5-1.5B-Thinking-v1.1
-#     results:
-#       - task:
-#           type: text-generation
-#         dataset:
-#           name: openai/gsm8k
-#           type: GradeSchoolMath8K
-#         metrics:
-#           - name: GSM8k (0-Shot)
-#             type: GSM8k (0-Shot)
-#             value: 14.4%
 #           - name: GSM8k (Few-Shot)
 #             type: GSM8k (Few-Shot)
 #             value: 63.31%
@@ -39,13 +39,12 @@ This model is a fine-tuned version of [Qwen/Qwen2.5-1.5B-Instruct](https://huggi
 It has been trained using [TRL](https://github.com/huggingface/trl).
-<!-- ## Evals
 | Model                                    | GSM8k 0-Shot | GSM8k Few-Shot |
 |------------------------------------------|------------------|-------------------|
-| Mistral-7B-v0.1                          | 10             | 41              |
-| Qwen2.5-1.5B-Thinking             | 14.4             | 63.31                 |
- -->
 ## Training procedure

 licence: license
 datasets:
 - microsoft/orca-math-word-problems-200k
+model-index:
+   - name: Qwen2.5-1.5B-Thinking-v1.1
+     results:
+       - task:
+           type: text-generation
+         dataset:
+           name: openai/gsm8k
+           type: GradeSchoolMath8K
+         metrics:
+           - name: GSM8k (0-Shot)
+             type: GSM8k (0-Shot)
+             value: 17%
 #           - name: GSM8k (Few-Shot)
 #             type: GSM8k (Few-Shot)
 #             value: 63.31%
 It has been trained using [TRL](https://github.com/huggingface/trl).
+## Evals
 | Model                                    | GSM8k 0-Shot | GSM8k Few-Shot |
 |------------------------------------------|------------------|-------------------|
+| Mistral-7B-v0.1                          | 10%             | 41%              |
+| Qwen2.5-1.5B-Thinking             | 17%            | N/A                |
 ## Training procedure
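
This commit updates the GSM8k 0-shot score in two places: the `model-index` metadata and the Evals table. As a minimal sketch of how the two could be checked for agreement, the Python below parses a Markdown pipe table and compares the Qwen row against the metadata value. The helper `parse_markdown_table` and the constants are hypothetical names introduced for illustration; they are not part of the repository, and the table is re-typed here with normalized spacing.

```python
# Hypothetical helper (not from the repository): parse a Markdown pipe table
# into a list of dicts keyed by the header cells.
def parse_markdown_table(text):
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    header = [cell.strip() for cell in lines[0].strip("|").split("|")]
    rows = []
    for line in lines[2:]:  # skip the header row and the separator row
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        rows.append(dict(zip(header, cells)))
    return rows


# Evals table as it reads after this commit (spacing normalized).
EVALS_TABLE = """
| Model                 | GSM8k 0-Shot | GSM8k Few-Shot |
|-----------------------|--------------|----------------|
| Mistral-7B-v0.1       | 10%          | 41%            |
| Qwen2.5-1.5B-Thinking | 17%          | N/A            |
"""

# Value copied by hand from the updated model-index block in this commit.
MODEL_INDEX_ZERO_SHOT = "17%"

rows = parse_markdown_table(EVALS_TABLE)
qwen_row = next(r for r in rows if r["Model"].startswith("Qwen2.5-1.5B-Thinking"))

# True when the table and the metadata report the same 0-shot score.
table_matches_metadata = qwen_row["GSM8k 0-Shot"] == MODEL_INDEX_ZERO_SHOT
print(table_matches_metadata)
```

A check like this only compares strings; it makes no claim about how the scores themselves were measured.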