Update README.md
tags:
- generated_from_trainer
- trl
- sft
- reasoning
license: license
datasets:
- openai/gsm8k
---

# Velma-9b

## Model Overview

**Velma-9b** is a fine-tuned version of [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it), optimized to improve reasoning capabilities. The model was trained on the [GSM8K dataset](https://huggingface.co/datasets/openai/gsm8k), a benchmark designed to enhance mathematical and logical reasoning skills in language models.

This fine-tuning allows Velma-9b to excel at structured problem-solving, step-by-step reasoning, and logical inference, making it a strong choice for tasks requiring in-depth analytical thinking.

## Features

- **Fine-Tuned on GSM8K:** Enhanced for mathematical reasoning and step-by-step logical problem-solving.
- **Transformer-Based:** Built on the `gemma-2-9b-it` architecture.
- **Optimized for SFT (Supervised Fine-Tuning):** Fine-tuned using [TRL (Transformer Reinforcement Learning)](https://github.com/huggingface/trl) for improved inference and structured output generation.
- **Efficient Deployment:** Compatible with `transformers` and supports GPU acceleration for fast inference.

## Quick Start

Use the following code to generate text with Velma-9b:

```python
from transformers import pipeline

# Initialize the pipeline
generator = pipeline("text-generation", model="AmirMohseni/Velma-9b", device="cuda")

# Example prompt
question = "If you had a time machine but could only go to the past or the future once and never return, which would you choose and why?"

# Generate output
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
```

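Velma-9b answers with step-by-step reasoning, so for scoring you often want only the final number. A minimal post-processing helper (not part of the model's API; the regex is an assumption about typical output) might look like:

```python
import re

def final_number(text: str):
    """Return the last number in a step-by-step answer, or None."""
    # Matches integers and decimals, allowing thousands separators.
    nums = re.findall(r"-?\d[\d,]*(?:\.\d+)?", text)
    return nums[-1].replace(",", "") if nums else None

print(final_number("3 trays of 12 rolls is 3 * 12 = 36, so the answer is 36."))  # -> 36
```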
## Training Procedure

Velma-9b was fine-tuned using the **Supervised Fine-Tuning (SFT)** approach with the GSM8K dataset. This dataset contains high-quality mathematical and logical reasoning problems that help models develop structured thinking and problem-solving skills.

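Each GSM8K record pairs a word problem with a worked solution whose last line is a `#### <answer>` marker; splitting on that marker (shown here with an illustrative record rather than a real dataset row) recovers the gold answer:

```python
# Illustrative GSM8K-style record; real rows come from
# load_dataset("openai/gsm8k", "main") and use the same fields.
example = {
    "question": "A baker makes 3 trays of 12 rolls. How many rolls in total?",
    "answer": "3 trays * 12 rolls = 36 rolls.\n#### 36",
}

# Split the worked solution from the final "#### N" answer marker.
reasoning, _, gold = example["answer"].partition("####")
print(gold.strip())  # -> 36
```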
### Training Details

- **Base Model:** [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it)
- **Dataset Used:** [GSM8K](https://huggingface.co/datasets/openai/gsm8k)
- **Fine-Tuning Method:** SFT (STaR) using [TRL](https://github.com/huggingface/trl)
- **Optimization Objective:** Supervised fine-tuning to enhance structured reasoning
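TRL's `SFTTrainer` consumes a single text field per example, so a formatting step along these lines (a sketch: the turn markers follow Gemma's chat format written out by hand, and the exact formatting used for Velma-9b is not published in this card) turns GSM8K pairs into training strings:

```python
# Sketch of SFT data formatting (field names match GSM8K's "question"
# and "answer" columns; in practice tokenizer.apply_chat_template
# would render these turn markers).
def to_training_text(example: dict) -> dict:
    return {
        "text": (
            f"<start_of_turn>user\n{example['question']}<end_of_turn>\n"
            f"<start_of_turn>model\n{example['answer']}<end_of_turn>"
        )
    }

row = to_training_text({"question": "What is 7 * 8?", "answer": "7 * 8 = 56.\n#### 56"})
print(row["text"].startswith("<start_of_turn>user"))  # -> True
```

With a `text` column in place, the mapped dataset can be handed to `SFTTrainer` for training.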

### Framework Versions

- **TRL:** `0.12.1`
- **Transformers:** `4.46.3`
- **PyTorch:** `2.1.1`
- **Datasets:** `3.1.0`
- **Tokenizers:** `0.20.3`
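To reproduce this environment, the pinned versions above can be installed directly (package names assumed to be the standard PyPI ones):

```shell
pip install "trl==0.12.1" "transformers==4.46.3" "torch==2.1.1" "datasets==3.1.0" "tokenizers==0.20.3"
```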

## Use Cases

Velma-9b is best suited for tasks requiring structured reasoning and logical inference:

- **Mathematical & Logical Reasoning Tasks:** Providing step-by-step explanations and structured problem-solving.
- **Education & Tutoring Applications:** Assisting students with detailed, logic-driven answers.
- **AI Research & Experimentation:** Evaluating fine-tuning strategies for reasoning-focused language models.