AmirMohseni committed on
Commit d361f2a · verified · 1 Parent(s): 679674c

Update README.md

Files changed (1)
  1. README.md +41 -28
README.md CHANGED
@@ -6,54 +6,67 @@ tags:
  - generated_from_trainer
  - trl
  - sft
+ - reasoning
  licence: license
  datasets:
  - openai/gsm8k
  ---

- # Model Card for Velma-9b
-
- This model is a fine-tuned version of [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it).
- It has been trained using [TRL](https://github.com/huggingface/trl).
-
- ## Quick start
+ # Velma-9b
+
+ ## Model Overview
+
+ **Velma-9b** is a fine-tuned version of [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it), optimized to improve reasoning capabilities. The model has been trained on the [GSM8K dataset](https://huggingface.co/datasets/openai/gsm8k), a benchmark designed to enhance mathematical and logical reasoning skills in language models.
+
+ This fine-tuning allows Velma-9b to excel at structured problem-solving, step-by-step reasoning, and logical inference, making it well suited to tasks that require in-depth analytical thinking.
+
+ ## Features
+
+ - **Fine-Tuned on GSM8K:** Enhanced for mathematical reasoning and step-by-step logical problem-solving.
+ - **Transformer-Based:** Built on the `gemma-2-9b-it` architecture.
+ - **Supervised Fine-Tuning (SFT):** Fine-tuned using [TRL (Transformer Reinforcement Learning)](https://github.com/huggingface/trl) for structured output generation.
+ - **Efficient Deployment:** Compatible with `transformers` and supports GPU acceleration for fast inference.
+
+ ## Quick Start
+
+ Use the following code to generate text with Velma-9b:

  ```python
  from transformers import pipeline

- question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
+ # Initialize the pipeline
  generator = pipeline("text-generation", model="AmirMohseni/Velma-9b", device="cuda")
+
+ # Example prompt
+ question = "If you had a time machine but could only go to the past or the future once and never return, which would you choose and why?"
+
+ # Generate output
  output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
  print(output["generated_text"])
  ```

- ## Training procedure
-
- This model was trained with SFT.
-
- ### Framework versions
-
- - TRL: 0.12.1
- - Transformers: 4.46.3
- - Pytorch: 2.1.1
- - Datasets: 3.1.0
- - Tokenizers: 0.20.3
-
- ## Citations
-
- Cite TRL as:
-
- ```bibtex
- @misc{vonwerra2022trl,
-     title        = {{TRL: Transformer Reinforcement Learning}},
-     author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouédec},
-     year         = 2020,
-     journal      = {GitHub repository},
-     publisher    = {GitHub},
-     howpublished = {\url{https://github.com/huggingface/trl}}
- }
- ```
+ ## Training Procedure
+
+ Velma-9b was fine-tuned using **Supervised Fine-Tuning (SFT)** on the GSM8K dataset, which contains high-quality mathematical and logical reasoning problems that help models develop structured thinking and problem-solving skills.
+
+ ### Training Details
+
+ - **Base Model:** [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it)
+ - **Dataset Used:** [GSM8K](https://huggingface.co/datasets/openai/gsm8k)
+ - **Fine-Tuning Method:** SFT (STaR) using [TRL](https://github.com/huggingface/trl)
+ - **Optimization Objective:** Supervised fine-tuning to enhance structured reasoning
+
+ ### Framework Versions
+
+ - **TRL:** `0.12.1`
+ - **Transformers:** `4.46.3`
+ - **PyTorch:** `2.1.1`
+ - **Datasets:** `3.1.0`
+ - **Tokenizers:** `0.20.3`
+
+ ## Use Cases
+
+ Velma-9b is best suited for tasks requiring structured reasoning and logical inference:
+
+ - **Mathematical & Logical Reasoning:** Step-by-step explanations and structured problem-solving.
+ - **Education & Tutoring:** Assisting students with detailed, logic-driven answers.
+ - **AI Research & Experimentation:** Evaluating fine-tuning strategies for reasoning-focused language models.
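
The GSM8K problems the updated card references are multi-step word problems that a model solves as a chain of single arithmetic steps. A minimal sketch of what one such reasoning trace computes (the word problem here is invented for illustration, not an actual GSM8K item):

```python
# Illustrative GSM8K-style word problem (invented for this example):
# "A baker sells 24 loaves a day at $3 per loaf. How much does the baker
#  earn in a 7-day week?"
# A step-by-step reasoning trace decomposes it into single arithmetic steps:
loaves_per_day = 24
price_per_loaf = 3
daily_revenue = loaves_per_day * price_per_loaf  # step 1: 24 * 3 = 72
weekly_revenue = daily_revenue * 7               # step 2: 72 * 7 = 504
print(weekly_revenue)  # 504
```

Supervised fine-tuning on traces of this shape is what the card credits for the model's step-by-step problem-solving behavior.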