Llama-3.1-8B-tulu3-mixture-math-reasoning-full-muon
This is a fine-tuned version of Meta-Llama-3.1-8B, trained on a mixture of math reasoning datasets with full-parameter tuning and the Muon optimizer, following the Tulu3 approach.
Model Details
- Base Model: Meta-Llama-3.1-8B
- Architecture: LlamaForCausalLM
- Parameters: ~8B
- Training: Full-parameter fine-tuning (no LoRA/QLoRA adapters)
- Checkpoint: 2611
- Training Configuration:
  - Effective batch size: 128
  - Learning rate: 5e-05
  - Method: Full-parameter tuning with the Muon optimizer (see the sketch below)
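Muon orthogonalizes the momentum-based update of each 2D weight matrix before applying it, typically via a few Newton-Schulz iterations. The following is a minimal illustrative sketch of that step in PyTorch, not the actual training code for this model; the coefficients follow the commonly circulated Muon reference implementation.
import torch
def newton_schulz_orthogonalize(g: torch.Tensor, steps: int = 5) -> torch.Tensor:
    # Approximately map the update matrix to the nearest semi-orthogonal matrix.
    a, b, c = 3.4445, -4.7750, 2.0315
    x = g.bfloat16()
    transposed = g.size(0) > g.size(1)
    if transposed:
        x = x.mT
    x = x / (x.norm() + 1e-7)  # keep the spectral norm near 1 before iterating
    for _ in range(steps):
        s = x @ x.mT
        x = a * x + (b * s + c * s @ s) @ x
    return x.mT if transposed else x
In Muon, this orthogonalized matrix is computed from the momentum buffer and used as the update for hidden-layer weight matrices; embeddings, the LM head, and 1D parameters are typically handled by AdamW.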
Model Configuration
- Vocabulary Size: 128,256
- Hidden Size: 4096
- Number of Layers: 32
- Number of Attention Heads: 32
- Max Position Embeddings: 131,072
- RoPE Scaling: Llama3 with factor 8.0
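These values can be read directly from the model's config.json on the Hub; the field names below follow the standard transformers LlamaConfig.
from transformers import AutoConfig
config = AutoConfig.from_pretrained("pmahdavi/Llama-3.1-8B-tulu3-mixture-math-reasoning-full-muon")
print(config.vocab_size)               # 128256
print(config.hidden_size)              # 4096
print(config.num_hidden_layers)        # 32
print(config.num_attention_heads)      # 32
print(config.max_position_embeddings)  # 131072
print(config.rope_scaling)             # {'rope_type': 'llama3', 'factor': 8.0, ...}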
Usage
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "pmahdavi/Llama-3.1-8B-tulu3-mixture-math-reasoning-full-muon"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype=torch.bfloat16,
device_map="auto"
)
# Example usage
prompt = "Solve this math problem: What is 2x + 5 = 11?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
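Because the model is instruction-tuned, prompts formatted with the tokenizer's chat template will generally match the training distribution better than raw strings. The snippet below assumes the uploaded tokenizer ships a chat template; if it does not, fall back to plain prompts as above.
messages = [{"role": "user", "content": "Solve for x: 2x + 5 = 11. Show your reasoning."}]
# apply_chat_template renders the conversation the way the model saw it during SFT
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))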
Training Details
This model was fine-tuned using LLaMA-Factory with:
- Mixed precision training (bfloat16)
- Gradient checkpointing
- Custom mixture of math reasoning datasets
- Tulu3 methodology for instruction following
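For orientation, the equivalent settings in a plain transformers Trainer setup would look roughly like the sketch below. The per-device batch size and accumulation steps are assumptions chosen so the product with the device count matches the effective batch size of 128; the actual run used LLaMA-Factory, and Muon is not a built-in Trainer optimizer, so it would have to be plugged in separately.
from transformers import TrainingArguments
training_args = TrainingArguments(
    output_dir="llama31-8b-tulu3-math-muon",  # hypothetical output path
    per_device_train_batch_size=2,            # assumed
    gradient_accumulation_steps=8,            # assumed; with 8 GPUs: 2 * 8 * 8 = 128
    learning_rate=5e-5,                       # from the training configuration above
    bf16=True,                                # mixed precision (bfloat16)
    gradient_checkpointing=True,              # trade extra compute for lower memory
)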
Limitations
- Designed for mathematical reasoning tasks; may not perform as well on general conversation or other domains
- Inherits the limitations of the base Llama 3.1 model
Citation
If you use this model, please cite the original Llama 3.1 paper and the Tulu3 paper.