---
base_model: Qwen/Qwen3-32B
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3
- trl
- reasoning
- mathematics
- coding
license: apache-2.0
language:
- en
datasets:
- open-thoughts/OpenThoughts2-1M
- open-r1/OpenR1-Math-220k
- nvidia/OpenMathReasoning
pipeline_tag: text-generation
library_name: transformers
inference: true
---
# Manticore-32B

![Manticore-32B Logo](./image.png)

**A powerful reasoning-focused language model optimized for multi-step problem solving**

[![Apache 2.0 License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![HF Spaces](https://img.shields.io/badge/šŸ¤—%20Spaces-Demo-yellow)](https://huggingface.co/spaces/Daemontatox/Manticore32)
[![Model Downloads](https://img.shields.io/badge/downloads-stats-green)](https://huggingface.co/Daemontatox/Manticore-32B)
## šŸ“‹ Table of Contents

- [Model Overview](#model-overview)
- [Key Capabilities](#key-capabilities)
- [Training Details](#training-details)
- [Dataset Information](#dataset-information)
- [Usage Guide](#usage-guide)
  - [Quick Start](#quick-start)
  - [Advanced Usage](#advanced-usage)
- [Benchmarks](#benchmarks)
- [Limitations](#limitations)
- [Acknowledgments](#acknowledgments)
- [Citation](#citation)

## šŸ” Model Overview

**Manticore-32B** is a fine-tuned version of Qwen3-32B, trained on high-quality synthetic data to handle complex reasoning tasks. Developed by [Daemontatox](https://huggingface.co/Daemontatox), it combines the base capabilities of Qwen3 with targeted optimization for step-by-step problem solving across multiple domains.

**Base Model:** [unsloth/qwen3-32b-unsloth](https://huggingface.co/unsloth/qwen3-32b-unsloth)

## 🌟 Key Capabilities

Manticore-32B is tuned to perform strongly on:

- **Mathematical Reasoning**: Complex problem solving with detailed step-by-step explanations
- **Logical Deduction**: Intricate puzzles and logical problems
- **Code Generation**: Efficient, well-documented code across multiple languages
- **Chain-of-Thought Reasoning**: Breaking complex problems into manageable steps
- **Multi-step Problem Solving**: Maintaining coherence across extended reasoning chains

## āš™ļø Training Details

- **Framework**: Fine-tuned with TRL + LoRA, using Unsloth acceleration
- **Optimization**: Quantized for efficient inference at 4-bit precision (BNB 4-bit)
- **Training Process**:
  - Custom fine-tuning on roughly 1 million samples
  - Specific focus on multi-step reasoning tasks
  - Progressive learning-rate scheduling for stable convergence
- **Hardware**: Single-node A100 80GB GPU setup
- **Training Objective**: Improve multi-domain reasoning while maintaining computational efficiency

## šŸ“Š Dataset Information

The model was trained on a curated combination of high-quality reasoning datasets:

| Dataset | Size | Focus Area | Content Type |
|---------|------|------------|--------------|
| [OpenThoughts2-1M](https://huggingface.co/datasets/open-thoughts/OpenThoughts2-1M) | ~1.1M examples | General reasoning | Multi-turn conversations, step-by-step solutions |
| [OpenR1-Math-220k](https://huggingface.co/datasets/open-r1/OpenR1-Math-220k) | 220K examples | Mathematical reasoning | Problem statements with detailed solutions |
| [OpenMathReasoning](https://huggingface.co/datasets/nvidia/OpenMathReasoning) | Supplementary | Advanced mathematics | University-level math problems |

These datasets were processed and filtered using [Curator Viewer](https://curator.bespokelabs.ai/) to keep only high-quality training examples.
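The exact training script is not published with this card. The sketch below is only a rough illustration of how a dataset mix and LoRA setup along the lines described above could be wired together with `datasets`, Unsloth, and TRL; the column names (`messages`, `problem`, `solution`), the sequence length, and all hyperparameters are assumptions for illustration, not the actual Manticore-32B configuration.

```python
# Minimal sketch of a LoRA fine-tune in the spirit of the setup described above.
# NOTE: column names, sequence length, and hyperparameters are assumptions,
# not the actual Manticore-32B training configuration.
import torch
from datasets import load_dataset, concatenate_datasets
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# 1) Load the reasoning datasets listed in the table above.
openthoughts = load_dataset("open-thoughts/OpenThoughts2-1M", split="train")
openr1_math = load_dataset("open-r1/OpenR1-Math-220k", split="train")

# 2) Normalize each source into a single "text" column for the trainer.
#    The field names used here ("messages", "problem", "solution") are
#    assumptions; check each dataset's schema before running.
def to_text(example):
    if "messages" in example:  # chat-style rows
        text = "\n".join(m["content"] for m in example["messages"])
    else:                      # problem/solution rows
        text = example.get("problem", "") + "\n" + example.get("solution", "")
    return {"text": text}

mixed = concatenate_datasets([
    openthoughts.map(to_text, remove_columns=openthoughts.column_names),
    openr1_math.map(to_text, remove_columns=openr1_math.column_names),
]).shuffle(seed=42)

# 3) Load the base model in 4-bit and attach LoRA adapters via Unsloth.
model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/qwen3-32b-unsloth",
    max_seq_length=8192,
    dtype=torch.bfloat16,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# 4) Supervised fine-tuning with TRL (shown with the older SFTTrainer API that
#    accepts dataset_text_field directly; newer TRL versions move it into SFTConfig).
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=mixed,
    dataset_text_field="text",
    max_seq_length=8192,
    args=TrainingArguments(
        output_dir="manticore-sft",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        learning_rate=2e-4,
        lr_scheduler_type="cosine",  # progressive schedule, as noted above
        num_train_epochs=1,
        bf16=True,
        logging_steps=10,
    ),
)
trainer.train()
```

A real run would also apply the Qwen3 chat template when flattening conversations (rather than the naive concatenation shown here) and would fold in the OpenMathReasoning supplement in the same way.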
## šŸš€ Usage Guide

### Quick Start

```python
from transformers import pipeline

# Initialize the pipeline with the model
pipe = pipeline(
    "text-generation",
    model="Daemontatox/Manticore-32B",
    torch_dtype="auto",
    device_map="auto",
)

# Basic chat format
messages = [
    {"role": "user", "content": (
        "Can you solve this math problem step by step? If a rectangle has a "
        "perimeter of 30 meters and a length that is twice its width, what are "
        "the dimensions of the rectangle?"
    )}
]

# Generate response
response = pipe(messages, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.95)

# With chat-style input, generated_text holds the whole conversation;
# the assistant's reply is the last message.
print(response[0]["generated_text"][-1]["content"])
```

### Advanced Usage

For more control over generation parameters and to utilize advanced features:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Daemontatox/Manticore-32B")
model = AutoModelForCausalLM.from_pretrained(
    "Daemontatox/Manticore-32B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    load_in_4bit=True,
)

# Format messages in chat template
messages = [
    {"role": "system", "content": "You are Manticore-32B, an AI assistant specialized in reasoning and problem-solving. Always show your work step-by-step when tackling problems."},
    {"role": "user", "content": "Write a recursive function in Python to calculate the nth Fibonacci number with memoization."}
]

# Create prompt using the chat template (add_generation_prompt cues the assistant turn)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Generate with more control
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    inputs.input_ids,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    top_k=40,
    repetition_penalty=1.1,
)

# Decode only the newly generated tokens
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```

#### Using with Unsloth for Even Faster Inference

```python
import torch
from unsloth import FastLanguageModel

# Load with Unsloth for optimized inference
model, tokenizer = FastLanguageModel.from_pretrained(
    "Daemontatox/Manticore-32B",
    dtype=torch.bfloat16,
    load_in_4bit=True,
    token="your_huggingface_token",  # Optional
)
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference path

# Create prompt
messages = [
    {"role": "user", "content": "Explain the concept of computational complexity and give examples of O(1), O(n), and O(n²) algorithms."}
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Generate
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    inputs.input_ids,
    max_new_tokens=768,
    do_sample=True,
    temperature=0.7,
)

# Decode only the newly generated tokens
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```

## šŸ“ˆ Benchmarks

Manticore-32B shows consistent gains over the base model on common reasoning benchmarks:

| Benchmark | Manticore-32B | Base Model | Improvement |
|-----------|---------------|------------|-------------|
| GSM8K     | 78.2%         | 71.5%      | +6.7 pts    |
| MATH      | 42.5%         | 37.8%      | +4.7 pts    |
| HumanEval | 75.6%         | 71.3%      | +4.3 pts    |
| BBH       | 69.3%         | 64.8%      | +4.5 pts    |

*Note: These results reflect zero-shot performance with temperature = 0.0 (greedy decoding).*
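The harness behind these numbers is not specified on this card. Purely as an illustration of the zero-shot, greedy-decoding setup mentioned in the note, the sketch below scores the model on a small GSM8K sample; the prompt wording and answer extraction are assumptions, and reproducing the table would require a full evaluation suite (for example lm-evaluation-harness) over the complete test splits.

```python
# Rough sketch of a zero-shot GSM8K check with greedy decoding (temperature 0).
# The prompt format and answer-extraction logic are assumptions for illustration,
# not the evaluation pipeline used to produce the table above.
import re
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Daemontatox/Manticore-32B")
model = AutoModelForCausalLM.from_pretrained(
    "Daemontatox/Manticore-32B", torch_dtype=torch.bfloat16, device_map="auto"
)

def final_number(text: str):
    """Return the last number mentioned in a piece of text, if any."""
    nums = re.findall(r"-?\d+\.?\d*", text.replace(",", ""))
    return nums[-1] if nums else None

# GSM8K reference answers end with "#### <number>"
dataset = load_dataset("gsm8k", "main", split="test").select(range(20))  # small sample

correct = 0
for ex in dataset:
    messages = [{"role": "user", "content": ex["question"] + "\nSolve step by step."}]
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(inputs.input_ids, max_new_tokens=512, do_sample=False)  # greedy
    completion = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
    gold = ex["answer"].split("####")[-1].strip()
    if final_number(completion) == final_number(gold):
        correct += 1

print(f"Accuracy on sample: {correct / len(dataset):.2%}")
```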
## āš ļø Limitations

Despite its strengths, users should be aware of the following limitations:

- **Language Support**: Primarily optimized for English; performance degrades significantly in other languages
- **Factual Accuracy**: While reasoning skills are enhanced, the model may still hallucinate factual information
- **Domain Knowledge**: Specialized knowledge outside common domains may be limited or incorrect
- **Context Window**: Inherited from Qwen3-32B (32K tokens natively, extendable to roughly 131K with YaRN)
- **Bias**: Inherits potential biases from the base model and the synthetic training data

## šŸ™ Acknowledgments

This model builds upon the exceptional work of:

- [Qwen Team](https://huggingface.co/qwen) for the base Qwen3-32B model
- [Unsloth](https://github.com/unslothai/unsloth) for optimization techniques
- [OpenThoughts Team](https://huggingface.co/open-thoughts) for their invaluable dataset

## šŸ“„ Citation

If you use this model in your research or applications, please cite:

```bibtex
@misc{daemontatox2025manticore,
  author       = {Daemontatox},
  title        = {Manticore-32B: A Fine-tuned Language Model for Advanced Reasoning},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Daemontatox/Manticore-32B}}
}
```