# Qwen-GRPO-Training Model
This is a fine-tuned version of the Qwen model, trained with GRPO (Group Relative Policy Optimization) on two datasets: NuminaMath-TIR (for R1 Zero training) and Bespoke-Stratos-17k (for R1 training). It is designed for high-performance causal language modeling tasks.
## Model Details
- Model Type: Causal Language Model (CausalLM)
- Training Data: NuminaMath-TIR and Bespoke-Stratos-17k, trained with GRPO (see the loading sketch below)
- Training Objective: Fine-tuned for general language understanding and specialized knowledge in mathematics, engineering, and technical domains.
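If you want to inspect the training data, it can be loaded with the `datasets` library. A minimal sketch; the Hub repo IDs below are assumptions based on the public dataset names and should be verified against the exact copies used for this fine-tune:

```python
from datasets import load_dataset

# Repo IDs are assumptions; check the Hub for the exact dataset versions used.
numinamath = load_dataset("AI-MO/NuminaMath-TIR", split="train")
stratos = load_dataset("bespokelabs/Bespoke-Stratos-17k", split="train")
print(numinamath[0])
```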
## Usage
To use this model, load it with the `transformers` library.
### Installation
Make sure you have the necessary libraries installed:

```bash
pip install transformers torch
```
### Example Code
Here’s a quick example to load and use the model for text generation:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Replace with your Hugging Face repo ID
model_id = "joe-xhedi/Qwen-GRPO-training"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True,
    padding_side="right"
)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load the model
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16
).to(device)

# Generate from a plain text prompt
inputs = tokenizer("Your prompt here", return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=256)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
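For multi-turn prompts, you can pass a `messages` list through the tokenizer's chat template. A hedged sketch, assuming the tokenizer ships a chat template (true for standard Qwen checkpoints; verify for this fine-tune):

```python
# Chat-style generation via the tokenizer's chat template.
messages = [
    {"role": "user", "content": "What is the derivative of x^2?"},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant-turn marker
    return_tensors="pt",
).to(device)
outputs = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```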
## Model Parameters
- Tokenizer: Loads the pre-trained tokenizer that ships with this model.
- Model: The causal language model fine-tuned for high-quality text generation.
- Device: The example selects the GPU when one is available and falls back to the CPU otherwise.
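By default, `generate()` follows the model's generation config, which is often greedy and short. A minimal sketch of commonly tuned sampling settings, reusing `inputs` from the example above; the values are illustrative, not the settings used during training:

```python
outputs = model.generate(
    **inputs,
    max_new_tokens=512,   # cap on generated length
    do_sample=True,       # sample instead of greedy decoding
    temperature=0.7,      # illustrative value; tune for your task
    top_p=0.9,            # nucleus sampling cutoff
    pad_token_id=tokenizer.pad_token_id,
)
```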
## Notes
- The `trust_remote_code=True` flag allows the tokenizer and model to execute code from the repository. Use with caution.
- The model uses `torch_dtype=torch.bfloat16` to reduce memory usage.
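bfloat16 is not supported on all hardware (e.g. older GPUs or CPU-only machines). A minimal fallback sketch, assuming you want the widest-supported dtype rather than the exact training precision:

```python
import torch
from transformers import AutoModelForCausalLM

model_id = "joe-xhedi/Qwen-GRPO-training"

# Pick the widest-supported dtype: bfloat16 where the GPU supports it,
# float16 on older CUDA GPUs, full precision on CPU.
if torch.cuda.is_available() and torch.cuda.is_bf16_supported():
    dtype = torch.bfloat16
elif torch.cuda.is_available():
    dtype = torch.float16
else:
    dtype = torch.float32

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=dtype,
)
```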