---
license: mit
library_name: llama.cpp
tags:
- mistral
- gguf
- onnx
- quantized
- edge-llm
- raspberry-pi
- local-inference
model_creator: mistralai
language: en
---

# Mistral-7B-Instruct-v0.2: Local LLM Model Repository

This repository provides quantized GGUF and ONNX exports of [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2), optimized for efficient local inference, especially on resource-constrained devices such as the Raspberry Pi.

---

## 🦙 GGUF Model (Q8_0)

- **Filename:** `mistral-7b-instruct-v0.2.Q8_0-Q8_0.gguf`
- **Format:** GGUF (Q8_0)
- **Best for:** `llama.cpp`, `koboldcpp`, LM Studio, and similar tools.

### Quick Start

```bash
# On newer llama.cpp builds the example binary is named llama-cli instead of main.
./main -m mistral-7b-instruct-v0.2.Q8_0-Q8_0.gguf -p "Hello, world!"
```

This quantized GGUF model is designed for fast, memory-efficient inference on local hardware, including the Raspberry Pi and other edge devices.

---

## 🟦 ONNX Model

- **Filename:** `mistral-7b-instruct-v0.2.onnx`
- **Format:** ONNX
- **Best for:** ONNX Runtime, KleidiAI, and compatible frameworks.

### Quick Start

```python
import onnxruntime as ort

session = ort.InferenceSession("mistral-7b-instruct-v0.2.onnx")

# Inspect the graph's expected input names before wiring up generation.
print([i.name for i in session.get_inputs()])
# ... inference code here ...
```

The ONNX export enables efficient inference on CPUs, GPUs, and accelerators, making it well suited to local deployment.

---

## 📋 Credits

- **Base model:** [Mistral AI](https://mistral.ai/)
- **Quantization:** [llama.cpp](https://github.com/ggerganov/llama.cpp)
- **ONNX export:** [Optimum](https://github.com/huggingface/optimum), [ONNX Runtime](https://github.com/microsoft/onnxruntime)

---

**Maintainer:** [Makatia](https://huggingface.co/Makatia)
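
---

## 🐍 Python Example: llama-cpp-python (sketch)

The GGUF quick start above drives `llama.cpp` from the command line. If you prefer Python, the snippet below is a minimal sketch using the third-party `llama-cpp-python` bindings; they are not part of this repository, and the prompt, context size, and thread count are illustrative assumptions to tune for your hardware.

```python
# Minimal sketch (assumption: `pip install llama-cpp-python` and the GGUF file
# sits in the current working directory).
from llama_cpp import Llama

llm = Llama(
    model_path="mistral-7b-instruct-v0.2.Q8_0-Q8_0.gguf",
    n_ctx=2048,    # context window; lower this on memory-constrained boards
    n_threads=4,   # roughly match the number of physical cores (e.g. 4 on a Pi)
)

# Mistral-Instruct models expect the [INST] ... [/INST] prompt template.
output = llm(
    "[INST] Summarize what a Raspberry Pi is in one sentence. [/INST]",
    max_tokens=128,
    temperature=0.7,
)
print(output["choices"][0]["text"])
```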
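
---

## 🧩 Python Example: ONNX Generation via Optimum (sketch)

The ONNX quick start above only opens a raw `onnxruntime` session; full text generation that way means handling the tokenizer and past-key-value inputs yourself. As an alternative, the sketch below uses [Optimum](https://github.com/huggingface/optimum) (already credited for the export), which wraps the ONNX session behind the familiar `generate()` API. The `./onnx` directory name and the presence of `config.json` and tokenizer files next to the graph are assumptions; adjust the paths to match how you lay out this repository's files.

```python
# Hypothetical layout: the ONNX export lives in ./onnx alongside the config and
# tokenizer files that Optimum writes during export.
from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer

model_dir = "./onnx"  # adjust to wherever you placed the export
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = ORTModelForCausalLM.from_pretrained(model_dir)

prompt = "[INST] Explain ONNX Runtime in one sentence. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt")
generated = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```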