---
license: apache-2.0
language:
- en
library_name: gguf
pipeline_tag: text-generation
tags:
- mathematical-reasoning
- qwen3
- gguf
- quantized
- imatrix
- importance-matrix
- math
- reasoning
- fine-tuned
base_model: PinkPixel/Crystal-Think-V2
quantized_by: PinkPixel
---
*Crystal Think V2 Logo*

# 🧠 Crystal Think V2 - GGUF Imatrix Quantized ✨

**Premium Quality GGUF Quantizations with Importance Matrix Optimization**

> **🔗 Original Model:** [PinkPixel/Crystal-Think-V2](https://huggingface.co/PinkPixel/Crystal-Think-V2)
> **📦 Quantized by:** Pink Pixel
> **🏷️ License:** Apache 2.0
> **🎯 Special Feature:** Importance Matrix Enhanced

---

## 📋 About This Repository

This repository contains **premium GGUF quantized versions** of Crystal Think V2, enhanced with **Importance Matrix (imatrix)** optimization. These quantizations use calibration data to preserve the most critical model activations, resulting in **superior quality** compared to standard quantizations.

### 🌟 **What is Importance Matrix?**

**Importance Matrix** is an advanced quantization technique that:

- 📊 **Analyzes activation patterns** using calibration data
- 🎯 **Identifies the critical weights** whose quantization error most degrades model outputs
- 🔧 **Preserves precision** where it matters most
- ⚡ **Maintains efficiency** while maximizing quality retention

**Result:** Better mathematical reasoning performance at the same file sizes! 🚀 A sketch of the workflow is shown below.
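For readers who want to see how such quantizations are produced, here is a minimal sketch of the llama.cpp imatrix workflow. The file names (`calibration-math.txt`, `crystal-think-v2-f16.gguf`) are illustrative assumptions, not files shipped in this repository, and the tool names vary across llama.cpp versions (older builds call them `imatrix` and `quantize`).

```bash
# 1) Measure activation importance on a calibration corpus
#    (calibration-math.txt is an assumed plain-text file of math samples)
./llama-imatrix -m crystal-think-v2-f16.gguf -f calibration-math.txt -o imatrix.dat

# 2) Quantize, letting the importance matrix assign extra precision
#    to the most sensitive weights
./llama-quantize --imatrix imatrix.dat \
    crystal-think-v2-f16.gguf crystal-think-v2-q4_k_m-imat.gguf Q4_K_M
```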
### 🎯 Original Model Features

- 🧮 **Advanced Mathematical Reasoning** with enhanced chain-of-thought
- 📝 **Multi-step Problem Solving** with clear explanations
- 💻 **Mathematical Code Generation** and algorithm explanation
- 🎯 **Enhanced `<think>` Reasoning Format**
- 📊 **85.2% GSM8K accuracy** (+8.8% over base Qwen3-4B)

---

## 📦 Available Imatrix Quantizations

| Quantization | File Size | Use Case | Memory Required | Quality vs Standard |
|--------------|-----------|----------|-----------------|---------------------|
| **IQ4_XS** | 2.1GB | Ultra-efficient | ~5.5GB RAM | +3-5% better |
| **Q4_K_S** | 2.2GB | Small & fast | ~6GB RAM | +2-4% better |
| **IQ4_NL** | 2.2GB | Natural language optimized | ~6GB RAM | +4-6% better |
| **Q4_K_M** | 2.3GB | Balanced performance | ~6.5GB RAM | +3-5% better |
| **Q5_K_S** | 2.6GB | High quality, small | ~7GB RAM | +2-3% better |
| **Q5_K_M** | 2.7GB | **RECOMMENDED** | ~7.5GB RAM | +2-4% better |

### 💡 **Quantization Guide:**

- **IQ4_XS** - Smallest size with imatrix benefits
- **IQ4_NL** - Optimized for natural language tasks (math word problems!)
- **Q4_K_M** - **Best balance** of size and quality improvement
- **Q5_K_M** - **Recommended choice** for most users - excellent quality retention

---

## 🚀 Quick Start

### Using llama.cpp

```bash
# Download your preferred imatrix quantization
wget https://huggingface.co/PinkPixel/Crystal-Think-V2-GGUF-Imatrix/resolve/main/crystal-think-v2-q4_k_m-imat.gguf

# Run with llama.cpp (newer builds name this binary llama-cli)
./llama.cpp/main -m crystal-think-v2-q4_k_m-imat.gguf \
  -p "Solve this step by step: If x + 2y = 10 and 2x - y = 5, find x and y." \
  -n 512
```

### Using llama-cpp-python

```python
from llama_cpp import Llama

# Load the imatrix model
llm = Llama(
    model_path="crystal-think-v2-q5_k_m-imat.gguf",
    n_ctx=4096,         # Context length
    n_threads=8,        # CPU threads
    # n_gpu_layers=-1,  # Uncomment to offload all layers to a supported GPU
    verbose=False
)

# Mathematical reasoning example
prompt = """Solve this step by step:
A circular garden has a radius of 8 meters. If you want to build a rectangular fence around it with 2 meters clearance on all sides, what's the area of the rectangular fence?

Use <think></think> for your reasoning."""

response = llm(
    prompt,
    max_tokens=512,
    temperature=0.7,
    stop=["</s>", "<|endoftext|>"]  # end-of-sequence stop strings ("</s>" is an assumption; match your chat template)
)

print(response["choices"][0]["text"])
```

### Using Ollama

```bash
# Create Modelfile
echo 'FROM ./crystal-think-v2-q5_k_m-imat.gguf' > Modelfile

# Create Ollama model
ollama create crystal-think-v2-imat -f Modelfile

# Run the model
ollama run crystal-think-v2-imat "What is the integral of sin(x)cos(x)?"
```

---

## 🎯 Enhanced Reasoning Format

Crystal Think V2 uses a structured reasoning approach, preserved intact by the imatrix quantizations:

```
<think>
[Step-by-step reasoning process]
- Problem analysis and variable identification
- Mathematical equation setup
- Systematic solution steps
- Verification and checking
</think>

[Final organized answer]
1) Clear results with explanations
2) Numerical values with proper units
3) Context and practical interpretation
```
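Downstream applications often want the reasoning trace and the final answer separately. Here is a minimal sketch of splitting a completion on the `<think>` tags shown above; the helper name `split_reasoning` and the sample string are illustrative, not part of the model's API.

```python
import re

def split_reasoning(completion: str) -> tuple[str, str]:
    """Split a Crystal Think V2 completion into (reasoning, answer).

    Assumes the <think>...</think> format shown above; if no think
    block is found, the whole completion is treated as the answer.
    """
    match = re.search(r"<think>(.*?)</think>", completion, flags=re.DOTALL)
    if match is None:
        return "", completion.strip()
    reasoning = match.group(1).strip()
    answer = completion[match.end():].strip()
    return reasoning, answer

# Invented sample output for illustration
sample = "<think>From 2x - y = 5, y = 2x - 5; substituting gives x = 4, y = 3.</think>\nx = 4, y = 3"
thinking, answer = split_reasoning(sample)
print(answer)  # -> x = 4, y = 3
```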
---

## 📊 Performance Benchmarks

### Original Model Performance

| Benchmark | Score | Improvement over Base |
|-----------|-------|-----------------------|
| **GSM8K** | 85.2% | +8.8% |
| **MATH** | 42.1% | +10.4% |
| **Algebra** | 78.9% | +13.7% |
| **Geometry** | 71.3% | +12.5% |
| **Code Math** | 82.6% | +13.5% |

### Imatrix vs Standard GGUF Comparison

| Quantization | Standard GGUF | Imatrix GGUF | Improvement |
|--------------|---------------|--------------|-------------|
| **Q4_K_M** | ~92% orig. | ~95-97% orig. | **+3-5%** |
| **Q5_K_M** | ~95% orig. | ~97-99% orig. | **+2-4%** |
| **IQ4_NL** | N/A | ~94-96% orig. | **New format** |
| **IQ4_XS** | N/A | ~91-93% orig. | **Smallest size** |

### 🎯 **Why Imatrix is Better:**

- **Smarter quantization** - Preserves critical mathematical reasoning paths
- **Better accuracy** - Maintains performance on complex multi-step problems
- **Consistent quality** - Less degradation on edge cases and difficult problems

---

## 💻 Hardware Requirements

### Minimum Requirements

| Quantization | RAM | VRAM (GPU) | CPU |
|--------------|-----|------------|-----|
| IQ4_XS | 5.5GB | 3.5GB | 4 cores |
| Q4_K_S | 6GB | 4GB | 4 cores |
| IQ4_NL | 6GB | 4GB | 4 cores |
| Q4_K_M | 6.5GB | 4.5GB | 4 cores |
| Q5_K_S | 7GB | 5GB | 6 cores |
| Q5_K_M | 7.5GB | 5.5GB | 6 cores |

### Recommended for Best Performance

- **CPU**: Modern 8+ core processor (AMD Ryzen 7 / Intel i7 or better)
- **RAM**: 16GB+ system memory
- **GPU**: 8GB+ VRAM (RTX 4070 / RX 7800 XT or better for GPU acceleration)

---

## 🔧 Installation & Dependencies

### llama.cpp (Latest Version Recommended)

```bash
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# For GPU support (recent releases build with CMake instead: cmake -B build -DGGML_CUDA=ON)
make LLAMA_CUBLAS=1
```

### llama-cpp-python

```bash
pip install llama-cpp-python

# For GPU support (CUDA; newer versions use -DGGML_CUDA=on)
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python

# For GPU support (ROCm/AMD)
CMAKE_ARGS="-DLLAMA_HIPBLAS=on" pip install llama-cpp-python
```

### Ollama

```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
```

---

## 📚 Advanced Usage Examples

### Complex Mathematical Reasoning

```
Input: "A projectile is launched at 45° with initial velocity 50 m/s. Calculate the maximum height, range, and time of flight. Use g = 9.8 m/s²."
Expected: Detailed physics solution with kinematic equations
```

### Multi-step Algebra

```
Input: "Solve the system of equations: 2x + 3y - z = 7, x - 2y + 4z = -3, 3x + y + 2z = 10"
Expected: Systematic solution using elimination or substitution
```

### Calculus Problem

```
Input: "Find the area between the curves y = x² and y = 4x - x² from x = 0 to x = 4"
Expected: Step-by-step integration with proper setup
```
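For the first example, the reference values follow from standard projectile kinematics. The short sketch below computes them so a model's answer can be checked; it is a verification aid, not part of the model card's benchmark suite.

```python
import math

# Reference values for the projectile example: v0 = 50 m/s at 45°, g = 9.8 m/s²
v0 = 50.0
theta = math.radians(45)
g = 9.8

max_height = (v0 * math.sin(theta)) ** 2 / (2 * g)   # v²·sin²θ / 2g  ≈ 63.8 m
time_of_flight = 2 * v0 * math.sin(theta) / g        # 2v·sinθ / g    ≈ 7.2 s
horizontal_range = v0**2 * math.sin(2 * theta) / g   # v²·sin(2θ) / g ≈ 255.1 m

print(f"max height ≈ {max_height:.1f} m")
print(f"range ≈ {horizontal_range:.1f} m")
print(f"time of flight ≈ {time_of_flight:.2f} s")
```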
---

## 🔍 Quality Comparison Test

Test the imatrix advantage with this challenging problem:

```
Prompt: "A cylindrical tank with radius 3m and height 8m is filled with water to 75% capacity. If water is drained at a rate of 2m³/min, how long will it take to empty the tank completely? Also calculate the water level after 30 minutes of draining."

Expected Results:
- Initial volume calculation: π × 3² × 8 × 0.75 = 54π m³ ≈ 169.6 m³
- Time to empty: 27π minutes ≈ 84.8 minutes
- Water level after 30 min: ≈ 3.9 meters (6 m initially, minus 60 m³ drained over a 9π m² base)

Imatrix models should show cleaner reasoning and more accurate intermediate steps!
```
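To make the expected results above easy to verify, here is a small sketch computing them directly (plain geometry, no model involved):

```python
import math

# Cylindrical tank: r = 3 m, h = 8 m, filled to 75%, drained at 2 m³/min
radius, height, fill_fraction, drain_rate = 3.0, 8.0, 0.75, 2.0

base_area = math.pi * radius**2                       # 9π  ≈ 28.27 m²
initial_volume = base_area * height * fill_fraction   # 54π ≈ 169.6 m³

time_to_empty = initial_volume / drain_rate           # 27π ≈ 84.8 min
level_after_30 = (initial_volume - drain_rate * 30) / base_area  # ≈ 3.9 m

print(f"time to empty ≈ {time_to_empty:.1f} min")
print(f"water level after 30 min ≈ {level_after_30:.2f} m")
```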
---

## 🔗 Related Links

- **🏠 Original Model:** [PinkPixel/Crystal-Think-V2](https://huggingface.co/PinkPixel/Crystal-Think-V2)
- **📖 Model Documentation:** [Crystal Think V2 README](https://huggingface.co/PinkPixel/Crystal-Think-V2/blob/main/README.md)
- **🔧 Standard GGUF:** [Crystal Think V2 GGUF](https://huggingface.co/PinkPixel/Crystal-Think-V2-GGUF)
- **🛠️ llama.cpp:** [GitHub Repository](https://github.com/ggerganov/llama.cpp)
- **🐍 llama-cpp-python:** [PyPI Package](https://pypi.org/project/llama-cpp-python/)

---

## ⚠️ Limitations

- **Domain Focus**: Optimized for mathematical reasoning; may be less effective for general conversation
- **Calibration Dependency**: Imatrix quality depends on how well the calibration data matches your use case
- **Language**: Primarily trained on English mathematical content
- **Hardware Dependency**: Performance varies significantly with hardware specifications

---

## 🧪 Technical Details

### Imatrix Generation Process

1. **Calibration Data**: Used high-quality mathematical reasoning samples
2. **Activation Analysis**: Measured importance across all model layers
3. **Precision Mapping**: Applied higher precision to critical activations
4. **Quality Validation**: Tested on mathematical benchmarks

### Recommended Use Cases

- **Mathematical tutoring systems**
- **STEM education applications**
- **Research and analysis tools**
- **Competitive programming assistance**
- **Physics and engineering calculations**

---

## 🤝 Contributing

Found an issue with the imatrix quantizations, or have suggestions for improvements? Please open an issue or reach out!

---

## 📧 Contact & Support

- **Developer:** Pink Pixel
- **GitHub:** [https://github.com/pinkpixel-dev](https://github.com/pinkpixel-dev)
- **Website:** [https://pinkpixel.dev](https://pinkpixel.dev)
- **Email:** [admin@pinkpixel.dev](mailto:admin@pinkpixel.dev)

---

## 🙏 Acknowledgments

- **Original Model:** Crystal Think V2 by Pink Pixel
- **Base Model:** [Qwen/Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B) by the Qwen Team
- **Quantization Tools:** [llama.cpp](https://github.com/ggerganov/llama.cpp) by Georgi Gerganov
- **Imatrix Technique:** Importance-matrix quantization methodology from the llama.cpp project
- **Training Dataset:** [NVIDIA OpenMathReasoning](https://huggingface.co/datasets/nvidia/OpenMathReasoning)

---

**Made with ❤️ by Pink Pixel** ✨

*"Dream it, Pixel it"*

> **💡 Pro Tip:** For the best mathematical reasoning experience, try the **Q5_K_M-imat** or **IQ4_NL-imat** variants - they offer excellent quality retention with the benefits of importance matrix optimization!