---
base_model: qwen/qwen3-32b
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3
- trl
- reasoning
- mathematics
- coding
license: apache-2.0
language:
- en
datasets:
- open-thoughts/OpenThoughts2-1M
- open-r1/OpenR1-Math-220k
- nvidia/OpenMathReasoning
pipeline_tag: text-generation
library_name: transformers
inference: true
new_version: Daemontatox/Manticore-32B
---

<div align="center">

# Manticore-32B

**A powerful reasoning-focused language model optimized for multi-step problem solving**

[License: Apache 2.0](https://opensource.org/licenses/Apache-2.0) · [Demo Space](https://huggingface.co/spaces/Daemontatox/Manticore32) · [Model Repository](https://huggingface.co/Daemontatox/Manticore-32B)

</div>

## 📋 Table of Contents

- [Model Overview](#model-overview)
- [Key Capabilities](#key-capabilities)
- [Training Details](#training-details)
- [Dataset Information](#dataset-information)
- [Usage Guide](#usage-guide)
  - [Quick Start](#quick-start)
  - [Advanced Usage](#advanced-usage)
- [Benchmarks](#benchmarks)
- [Limitations](#limitations)
- [Acknowledgments](#acknowledgments)
- [Citation](#citation)

## 🔍 Model Overview

**Manticore-32B** is a specialized fine-tuned version of Qwen3-32B, engineered to excel at complex reasoning tasks through intensive training on high-quality synthetic data. Developed by [Daemontatox](https://huggingface.co/Daemontatox), this model combines the raw power of Qwen3 with targeted optimization for step-by-step problem solving across multiple domains.

**Base Model:** [unsloth/qwen3-32b-unsloth](https://huggingface.co/unsloth/qwen3-32b-unsloth)

## 🌟 Key Capabilities

Manticore-32B demonstrates exceptional performance in:

- **Mathematical Reasoning**: Complex problem solving with detailed step-by-step explanations
- **Logical Deduction**: Ability to handle intricate puzzles and logical problems
- **Code Generation**: Writing efficient, well-documented code across multiple languages
- **Chain-of-Thought Reasoning**: Breaking down complex problems into manageable steps
- **Multi-step Problem Solving**: Maintaining coherence across extended reasoning chains

## ⚙️ Training Details

- **Framework**: Fine-tuned using TRL + LoRA with Unsloth acceleration techniques (a sketch of a comparable setup is shown below)
- **Optimization**: Quantized for efficient inference with 4-bit precision (BNB-4bit)
- **Training Process**:
  - Custom fine-tuning across ~1 million samples
  - Specific focus on multi-step reasoning tasks
  - Progressive learning rate scheduling for optimal convergence
- **Hardware**: Single-node A100 80GB GPU setup
- **Training Objective**: Enhance multi-domain reasoning capabilities while maintaining computational efficiency
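
The exact training script is not published. The sketch below shows what a comparable Unsloth + TRL LoRA run could look like; the hyperparameters, the choice of OpenR1-Math-220k as the example corpus, its `problem`/`solution` field names, and the `to_text` helper are all illustrative assumptions rather than the recipe actually used.

```python
import torch
from datasets import load_dataset
from unsloth import FastLanguageModel
from trl import SFTConfig, SFTTrainer

# Load the base model in 4-bit with Unsloth (settings are illustrative)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/qwen3-32b-unsloth",
    max_seq_length=8192,
    load_in_4bit=True,
)

# Attach LoRA adapters; rank/alpha are example values, not the actual config
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing=True,
)

# Render each example to a single chat-formatted string.
# Field names differ between the three datasets; adjust to the actual schema.
def to_text(example):
    messages = [
        {"role": "user", "content": example["problem"]},
        {"role": "assistant", "content": example["solution"]},
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

dataset = load_dataset("open-r1/OpenR1-Math-220k", split="train").map(to_text)

# Supervised fine-tuning with TRL
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        learning_rate=2e-4,
        lr_scheduler_type="cosine",   # progressive learning rate scheduling
        num_train_epochs=1,
        bf16=True,
        output_dir="outputs",
    ),
)
trainer.train()
```
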
## 📊 Dataset Information

The model was trained on a carefully curated combination of high-quality reasoning datasets:

| Dataset | Size | Focus Area | Content Type |
|---------|------|------------|--------------|
| [OpenThoughts2-1M](https://huggingface.co/datasets/open-thoughts/OpenThoughts2-1M) | ~1.1M examples | General reasoning | Multi-turn conversations, step-by-step solutions |
| [OpenR1-Math-220k](https://huggingface.co/datasets/open-r1/OpenR1-Math-220k) | 220K examples | Mathematical reasoning | Problem statements with detailed solutions |
| [OpenMathReasoning](https://huggingface.co/datasets/nvidia/OpenMathReasoning) | Supplementary | Advanced mathematics | University-level math problems |

These datasets were processed and filtered using [Curator Viewer](https://curator.bespokelabs.ai/) to ensure the highest quality training examples.
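
The preprocessing code and mixing ratios are not published. The snippet below is only a sketch of pulling the three corpora with the `datasets` library and interleaving them; the split names, sampling probabilities, and the decision to interleave rather than concatenate are assumptions (the Curator filtering step is not shown).

```python
from datasets import load_dataset, interleave_datasets

# Stream the three reasoning corpora from the Hub (streaming avoids
# downloading millions of examples up front). Split/config names vary per
# dataset -- check each dataset card before running.
openthoughts = load_dataset("open-thoughts/OpenThoughts2-1M", split="train", streaming=True)
openr1_math = load_dataset("open-r1/OpenR1-Math-220k", split="train", streaming=True)
nvidia_math = load_dataset("nvidia/OpenMathReasoning", split="cot", streaming=True)

# Example mix weighted toward general reasoning; probabilities are
# illustrative, not the ratios used for Manticore-32B.
mixed = interleave_datasets(
    [openthoughts, openr1_math, nvidia_math],
    probabilities=[0.7, 0.2, 0.1],
    seed=42,
)

# Inspect a few examples from the mixed stream
for i, example in enumerate(mixed):
    print(example.keys())
    if i >= 2:
        break
```
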
## 🚀 Usage Guide

### Quick Start

```python
from transformers import pipeline

# Initialize the text-generation pipeline with the model
pipe = pipeline(
    "text-generation",
    model="Daemontatox/Manticore-32B",
    torch_dtype="auto",
    device_map="auto",
)

# Basic chat format
messages = [
    {"role": "user", "content": "Can you solve this math problem step by step? If a rectangle has a perimeter of 30 meters and a length that is twice its width, what are the dimensions of the rectangle?"}
]

# Generate a response
response = pipe(
    messages,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
)

# With chat-style input, "generated_text" holds the whole conversation;
# the last entry is the assistant's reply.
print(response[0]["generated_text"][-1]["content"])
```
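
Because the base model is Qwen3, the tokenizer's chat template may expose Qwen3's `enable_thinking` switch for turning the explicit reasoning trace on or off. Whether this flag is preserved by the fine-tuned chat template is an assumption; the sketch below shows how it is used with the stock Qwen3 template.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Daemontatox/Manticore-32B")

messages = [{"role": "user", "content": "What is 17 * 24? Show your reasoning."}]

# With the stock Qwen3 template, enable_thinking=False suppresses the
# <think>...</think> reasoning block; True (the default) keeps it.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,  # assumption: the fine-tuned template keeps this kwarg
)
print(prompt)
```
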
### Advanced Usage

For more control over generation parameters and to utilize advanced features:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Load model and tokenizer (4-bit quantization via bitsandbytes)
tokenizer = AutoTokenizer.from_pretrained("Daemontatox/Manticore-32B")
model = AutoModelForCausalLM.from_pretrained(
    "Daemontatox/Manticore-32B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)

# Format messages in the chat template
messages = [
    {"role": "system", "content": "You are Manticore-32B, an AI assistant specialized in reasoning and problem-solving. Always show your work step-by-step when tackling problems."},
    {"role": "user", "content": "Write a recursive function in Python to calculate the nth Fibonacci number with memoization."}
]

# Create the prompt and append the assistant generation prompt
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Generate with more control over sampling
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    top_k=40,
    repetition_penalty=1.1,
)

# Decode only the newly generated tokens
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```
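
For interactive use you may also want token-by-token streaming. This is not part of the original card; it is a minimal sketch that reuses the `model`, `tokenizer`, and `prompt` from the block above and relies on transformers' `TextStreamer`.

```python
from transformers import TextStreamer

# Stream decoded tokens to stdout as they are generated, skipping the prompt
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    streamer=streamer,
)
```
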
#### Using with Unsloth for Even Faster Inference

```python
from unsloth import FastLanguageModel
import torch

# Load with Unsloth for optimized inference
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Daemontatox/Manticore-32B",
    dtype=torch.bfloat16,
    load_in_4bit=True,
    token="your_huggingface_token",  # Optional
)

# Switch the model into Unsloth's faster inference mode
FastLanguageModel.for_inference(model)

# Create the prompt
messages = [
    {"role": "user", "content": "Explain the concept of computational complexity and give examples of O(1), O(n), and O(n²) algorithms."}
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Generate
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=768,
    do_sample=True,
    temperature=0.7,
)

# Decode only the newly generated tokens
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```

## 📈 Benchmarks

Manticore-32B demonstrates strong performance across multiple reasoning benchmarks:

| Benchmark | Score | Base Model Score | Improvement |
|-----------|-------|------------------|-------------|
| GSM8K | 78.2% | 71.5% | +6.7 pts |
| MATH | 42.5% | 37.8% | +4.7 pts |
| HumanEval | 75.6% | 71.3% | +4.3 pts |
| BBH | 69.3% | 64.8% | +4.5 pts |

*Note: These benchmarks reflect zero-shot performance with temperature=0.0.*
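
The evaluation harness behind these numbers is not specified. If you want to produce comparable zero-shot scores, EleutherAI's lm-evaluation-harness is one option; the task name, harness version, and generation settings below are assumptions, so some deviation from the table is expected.

```python
import lm_eval

# Zero-shot GSM8K evaluation sketch; greedy decoding in the harness
# corresponds to the temperature=0.0 setting noted above.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=Daemontatox/Manticore-32B,dtype=bfloat16",
    tasks=["gsm8k"],
    num_fewshot=0,
    batch_size=4,
)
print(results["results"]["gsm8k"])
```
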
## ⚠️ Limitations

Despite its strengths, users should be aware of the following limitations:

- **Language Support**: Primarily optimized for English; performance degrades significantly for other languages
- **Factual Accuracy**: While reasoning skills are enhanced, the model may still hallucinate factual information
- **Domain Knowledge**: Specialized knowledge outside common domains may be limited or incorrect
- **Context Window**: Inherited from Qwen3-32B (32K tokens natively, extendable to roughly 128K with YaRN)
- **Bias**: Inherits potential biases from the base model and synthetic training data

## 🙏 Acknowledgments

This model builds upon the exceptional work of:

- [Qwen Team](https://huggingface.co/qwen) for the base Qwen3-32B model
- [Unsloth](https://github.com/unsloth/unsloth) for optimization techniques
- [OpenThoughts Team](https://huggingface.co/open-thoughts) for their invaluable dataset

## 📄 Citation

If you use this model in your research or applications, please cite:

```bibtex
@misc{daemontatox2025manticore,
  author       = {Daemontatox},
  title        = {Manticore-32B: A Fine-tuned Language Model for Advanced Reasoning},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Daemontatox/Manticore-32B}}
}
```