Update README.md
README.md
CHANGED
@@ -1,7 +1,14 @@
+---
+datasets:
+- openai/gsm8k
+base_model:
+- meta-llama/Llama-3.1-8B-Instruct
+pipeline_tag: reinforcement-learning
+---
 # Llama 3.1-8B Mathematical Reasoning (GRPO)
 
 ## Full-Post-Training Instruction
-Please visit our [notebook](https://colab.research.google.com/drive/1kRmxAC5dL_rOqZUea11X2IdE5-mKbhnw?usp=sharing) for a full walkthrough on this project
+Please visit our [notebook](https://colab.research.google.com/drive/1kRmxAC5dL_rOqZUea11X2IdE5-mKbhnw?usp=sharing) for a full walkthrough on this project.
 
 ## Model Description
 