---
license: mit
library_name: exllamav2
language:
- en
base_model:
- Zyphra/ZR1-1.5B
datasets:
- AI-MO/NuminaMath-CoT
- codeparrot/apps
...
- MatrixStudio/Codeforces-Python-Submissions
pipeline_tag: text-generation
---

# ZR1-1.5B-exl2
Original model: [ZR1-1.5B](https://huggingface.co/Zyphra/ZR1-1.5B) by [Zyphra](https://huggingface.co/Zyphra)

Based on: [DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) by [DeepSeek](https://huggingface.co/deepseek-ai)

Foundation model: [Qwen2.5-1.5B](https://huggingface.co/Qwen/Qwen2.5-1.5B) by [Qwen](https://huggingface.co/Qwen)

## Quants
[4bpw h6 (main)](https://huggingface.co/cgus/ZR1-1.5B-exl2/tree/main)

[4.5bpw h6](https://huggingface.co/cgus/ZR1-1.5B-exl2/tree/4.5bpw-h6)

[5bpw h6](https://huggingface.co/cgus/ZR1-1.5B-exl2/tree/5bpw-h6)

[6bpw h6](https://huggingface.co/cgus/ZR1-1.5B-exl2/tree/6bpw-h6)

[8bpw h8](https://huggingface.co/cgus/ZR1-1.5B-exl2/tree/8bpw-h8)
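
Each quant lives on its own branch of this repo, so cloning `main` only fetches the 4bpw files. A minimal download sketch using `huggingface_hub` (the `local_dir` name is an arbitrary choice):

```python
# Minimal sketch: fetch a single quant branch with huggingface_hub.
# The revision names match the branch links above; local_dir is arbitrary.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="cgus/ZR1-1.5B-exl2",
    revision="6bpw-h6",  # or "main" (4bpw), "4.5bpw-h6", "5bpw-h6", "8bpw-h8"
    local_dir="ZR1-1.5B-exl2-6bpw-h6",
)
```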

## Quantization notes
Made with Exllamav2 0.2.8 using its default calibration dataset.
This model can be used with TabbyAPI or Text-Generation-WebUI with an RTX GPU on Windows, or with RTX and ROCm GPUs on Linux.
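
Besides TabbyAPI and Text-Generation-WebUI, the quant can also be loaded directly through the exllamav2 Python API. A sketch modeled on the exllamav2 0.2.x example scripts; the model directory (from the download sketch above) and the prompt are assumptions:

```python
# Sketch: load the exl2 quant directly with exllamav2 (0.2.x-style API).
from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2DynamicGenerator

config = ExLlamaV2Config("ZR1-1.5B-exl2-6bpw-h6")  # path from the download sketch above
model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)  # allocated while the model loads
model.load_autosplit(cache)               # split across available GPUs
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2DynamicGenerator(model=model, cache=cache, tokenizer=tokenizer)
print(generator.generate(
    prompt="Write a Python function that checks whether a number is prime.",
    max_new_tokens=512,
))
```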

# Original model card
# ZR1-1.5B

ZR1-1.5B is a small reasoning model trained extensively on both verified coding and mathematics problems with reinforcement learning. The model outperforms Llama-3.1-70B-Instruct on hard coding tasks and improves upon the base R1-Distill-1.5B model by over 50%, while achieving strong scores on math evaluations and a 37.91% pass@1 accuracy on GPQA-Diamond with just 1.5B parameters.
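
For context on the headline metric: pass@k figures like the 37.91% pass@1 above are conventionally computed with the unbiased estimator from the HumanEval paper (Chen et al., 2021). A minimal sketch; the sample counts in the usage line are illustrative, not the actual evaluation numbers:

```python
# Unbiased pass@k estimator (Chen et al., 2021): probability that at least
# one of k samples drawn from n generations, c of which pass, is correct.
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    if n - c < k:
        return 1.0  # every size-k draw must include a passing sample
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)

# Illustrative only: with 6 of 16 samples passing, pass@1 = 6/16 = 0.375.
print(pass_at_k(n=16, c=6, k=1))
```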