Update README.md
README.md
CHANGED
@@ -7,4 +7,13 @@ language:
 base_model:
 - meta-llama/Llama-3.2-3B-Instruct
 pipeline_tag: text-generation
----
+---
+Barcenas 3b GRPO
+
+Based on alpindale/Llama-3.2-3B-Instruct
+and trained on the openai/gsm8k dataset.
+
+The objective of this model is to test the novel GRPO training used in DeepSeek R1.
+It uses this reinforcement learning (RL) algorithm to improve the reasoning capabilities of Llama-3.2-3B-Instruct.
+
+Made with ❤️ in Guadalupe, Nuevo Leon, Mexico 🇲🇽
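
The README names GRPO and the openai/gsm8k dataset but shows no training code. The sketch below is not the author's training script. It only illustrates the core idea usually associated with GRPO: several completions are sampled for one prompt, each is scored, and every score is normalized against its own group. The function names, the exact-match reward, and the use of the population standard deviation are assumptions made for illustration. The only dataset detail relied on is that GSM8K reference solutions end with a `#### <answer>` line.

```python
import math


def extract_gsm8k_answer(answer_text):
    """Return the final answer that follows the '####' marker in a GSM8K solution."""
    return answer_text.split("####")[-1].strip().replace(",", "")


def correctness_reward(predicted, reference_answer_text):
    """Hypothetical reward: 1.0 for an exact match with the reference answer, else 0.0."""
    cleaned = predicted.strip().replace(",", "")
    return 1.0 if cleaned == extract_gsm8k_answer(reference_answer_text) else 0.0


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and std of its own sampling group."""
    n = len(rewards)
    if n == 0:
        return []
    mean = sum(rewards) / n
    # Population standard deviation (an assumption; some implementations use the sample std).
    std = math.sqrt(sum((r - mean) ** 2 for r in rewards) / n)
    return [(r - mean) / (std + eps) for r in rewards]


# One prompt, four sampled final answers (made-up values for illustration).
reference = "Natalia sold 48/2 = 24 clips in May.\n48 + 24 = 72\n#### 72"
predictions = ["72", "24", "72", "70"]

rewards = [correctness_reward(p, reference) for p in predictions]
advantages = group_relative_advantages(rewards)
print(rewards)
print(advantages)
```

Completions that beat their group's average get a positive advantage and the others a negative one, so no separate value model is needed to produce a baseline. When every completion in a group receives the same reward, all advantages are zero and that group contributes no learning signal.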