Update README.md
tags:
- generated_from_trainer
- trl
- sft
- reasoning
license: license
datasets:
- openai/gsm8k
---

# Velma-9b

## Model Overview

**Velma-9b** is a fine-tuned version of [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it), optimized to improve reasoning capabilities. The model was trained on the [GSM8K dataset](https://huggingface.co/datasets/openai/gsm8k), a benchmark designed to enhance mathematical and logical reasoning skills in language models.

This fine-tuning allows Velma-9b to excel at structured problem-solving, step-by-step reasoning, and logical inference, making it a strong choice for tasks requiring in-depth analytical thinking.

## Features

- **Fine-Tuned on GSM8K:** Enhanced for mathematical reasoning and step-by-step logical problem-solving.
- **Transformer-Based:** Built on the `gemma-2-9b-it` architecture.
- **Optimized for SFT (Supervised Fine-Tuning):** Fine-tuned using [TRL (Transformer Reinforcement Learning)](https://github.com/huggingface/trl) for improved inference and structured output generation.
- **Efficient Deployment:** Compatible with `transformers` and supports GPU acceleration for fast inference.

## Quick Start

Use the following code to generate text with Velma-9b:

```python
from transformers import pipeline

# Initialize the pipeline
generator = pipeline("text-generation", model="AmirMohseni/Velma-9b", device="cuda")

# Example prompt
question = "If you had a time machine but could only go to the past or the future once and never return, which would you choose and why?"

# Generate output
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
```

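Velma-9b answers with step-by-step reasoning, so for scoring you often want only the final number. A minimal post-processing helper (not part of the model's API; the regex is an assumption about typical output) might look like:

```python
import re

def final_number(text: str):
    """Return the last number in a step-by-step answer, or None."""
    # Matches integers and decimals, allowing thousands separators.
    nums = re.findall(r"-?\d[\d,]*(?:\.\d+)?", text)
    return nums[-1].replace(",", "") if nums else None

print(final_number("3 trays of 12 rolls is 3 * 12 = 36, so the answer is 36."))  # -> 36
```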
## Training Procedure

Velma-9b was fine-tuned using the **Supervised Fine-Tuning (SFT)** approach with the GSM8K dataset. This dataset contains high-quality mathematical and logical reasoning problems that help models develop structured thinking and problem-solving skills.

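Each GSM8K record pairs a word problem with a worked solution whose last line is a `#### <answer>` marker; splitting on that marker (shown here with an illustrative record rather than a real dataset row) recovers the gold answer:

```python
# Illustrative GSM8K-style record; real rows come from
# load_dataset("openai/gsm8k", "main") and use the same fields.
example = {
    "question": "A baker makes 3 trays of 12 rolls. How many rolls in total?",
    "answer": "3 trays * 12 rolls = 36 rolls.\n#### 36",
}

# Split the worked solution from the final "#### N" answer marker.
reasoning, _, gold = example["answer"].partition("####")
print(gold.strip())  # -> 36
```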
### Training Details

- **Base Model:** [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it)
- **Dataset Used:** [GSM8K](https://huggingface.co/datasets/openai/gsm8k)
- **Fine-Tuning Method:** SFT (STaR) using [TRL](https://github.com/huggingface/trl)
- **Optimization Objective:** Supervised fine-tuning to enhance structured reasoning
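TRL's `SFTTrainer` consumes a single text field per example, so a formatting step along these lines (a sketch: the turn markers follow Gemma's chat format written out by hand, and the exact formatting used for Velma-9b is not published in this card) turns GSM8K pairs into training strings:

```python
# Sketch of SFT data formatting (field names match GSM8K's "question"
# and "answer" columns; in practice tokenizer.apply_chat_template
# would render these turn markers).
def to_training_text(example: dict) -> dict:
    return {
        "text": (
            f"<start_of_turn>user\n{example['question']}<end_of_turn>\n"
            f"<start_of_turn>model\n{example['answer']}<end_of_turn>"
        )
    }

row = to_training_text({"question": "What is 7 * 8?", "answer": "7 * 8 = 56.\n#### 56"})
print(row["text"].startswith("<start_of_turn>user"))  # -> True
```

With a `text` column in place, the mapped dataset can be handed to `SFTTrainer` for training.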

### Framework Versions

- **TRL:** `0.12.1`
- **Transformers:** `4.46.3`
- **PyTorch:** `2.1.1`
- **Datasets:** `3.1.0`
- **Tokenizers:** `0.20.3`
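To reproduce this environment, the pinned versions above can be installed directly (package names assumed to be the standard PyPI ones):

```shell
pip install "trl==0.12.1" "transformers==4.46.3" "torch==2.1.1" "datasets==3.1.0" "tokenizers==0.20.3"
```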

## Use Cases

Velma-9b is best suited for tasks requiring structured reasoning and logical inference:

- **Mathematical & Logical Reasoning Tasks:** Providing step-by-step explanations and structured problem-solving.
- **Education & Tutoring Applications:** Assisting students with detailed, logic-driven answers.
- **AI Research & Experimentation:** Evaluating fine-tuning strategies for reasoning-focused language models.