Danielbrdz committed on
Commit 643e761 · verified · 1 Parent(s): a1c5481

Update README.md

Files changed (1)
  1. README.md +10 -1
README.md CHANGED
@@ -7,4 +7,13 @@ language:
  base_model:
  - meta-llama/Llama-3.2-3B-Instruct
  pipeline_tag: text-generation
- ---
+ ---
+ Barcenas 3b GRPO
+
+ Based on alpindale/Llama-3.2-3B-Instruct
+ and trained on the openai/gsm8k dataset.
+
+ The objective of this model is to test GRPO (Group Relative Policy Optimization), the training method used in DeepSeek R1.
+ It uses this reinforcement learning (RL) algorithm to improve the reasoning capabilities of Llama-3.2-3B-Instruct.
+
+ Made with ❤️ in Guadalupe, Nuevo Leon, Mexico 🇲🇽
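
Note on the README text above: the commit names GRPO but does not show how it was applied. As a rough illustration only, the core idea of GRPO is to score each sampled answer relative to the other answers sampled for the same prompt. The sketch below is not the author's training code. The function name, the 0/1 reward scheme, the `eps` constant, and the use of the population standard deviation are all assumptions made for this example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the group sampled for one prompt.

    Hypothetical helper: subtract the group mean and divide by the group
    standard deviation. `eps` (an assumed constant) avoids division by
    zero when every sample in the group gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one GSM8K question:
# 1.0 if the final number matches the reference answer, else 0.0.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

# Correct answers get a positive advantage, incorrect ones a negative one.
for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.1f} advantage={advantage:+.3f}")
```

Because the baseline is the group mean, no separate value model is needed in this scheme. A group in which every answer receives the same reward yields zero advantage for all samples.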