---
license: mit
library_name: llama.cpp
tags:
- mistral
- gguf
- onnx
- quantized
- edge-llm
- raspberry-pi
- local-inference
model_creator: mistralai
language: en
---

# Mistral-7B-Instruct-v0.2: Local LLM Model Repository
This repository provides quantized GGUF and ONNX exports of Mistral-7B-Instruct-v0.2, optimized for efficient local inference—especially on resource-constrained devices like Raspberry Pi.
## 🦙 GGUF Model (Q8_0)

- **Filename:** `mistral-7b-instruct-v0.2.Q8_0-Q8_0.gguf`
- **Format:** GGUF (Q8_0)
- **Best for:** llama.cpp, koboldcpp, LM Studio, and similar tools.
### Quick Start

```bash
./main -m mistral-7b-instruct-v0.2.Q8_0-Q8_0.gguf -p "Hello, world!"
```
This quantized GGUF model is designed for fast, memory-efficient inference on local hardware, including Raspberry Pi and other edge devices.
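If you prefer calling the model from Python rather than the llama.cpp CLI, the sketch below uses the third-party llama-cpp-python bindings. It is a minimal illustration, not part of this repository: the context size, thread count, and prompt are placeholders to tune for your device.

```python
# Minimal sketch using llama-cpp-python (pip install llama-cpp-python).
# n_ctx and n_threads are example values; lower them on low-RAM edge devices
# such as a Raspberry Pi.
from llama_cpp import Llama

llm = Llama(
    model_path="mistral-7b-instruct-v0.2.Q8_0-Q8_0.gguf",
    n_ctx=4096,     # context window; reduce if memory is tight
    n_threads=4,    # roughly match the number of physical cores
)

# create_chat_completion applies the chat template stored in the GGUF metadata,
# so the Mistral-Instruct [INST] ... [/INST] formatting is handled for you.
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello, world!"}],
    max_tokens=128,
)
print(response["choices"][0]["message"]["content"])
```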
## 🟦 ONNX Model

- **Filename:** `mistral-7b-instruct-v0.2.onnx`
- **Format:** ONNX
- **Best for:** ONNX Runtime, Kleidi AI, and compatible frameworks.
### Quick Start

```python
import onnxruntime as ort

session = ort.InferenceSession("mistral-7b-instruct-v0.2.onnx")
# ... inference code here ...
```
The ONNX export enables efficient inference on CPUs, GPUs, and accelerators—ideal for local deployment.
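The exact input and output names of the graph depend on the export tool, so it helps to inspect the session's signature before writing a generation loop. A minimal sketch with ONNX Runtime follows; the provider list is illustrative and not specific to this export.

```python
# Minimal sketch: inspect the exported graph's inputs and outputs with ONNX Runtime.
# Input names (e.g. input_ids, attention_mask, past key/value tensors) vary by
# export tool, so check them before building a generation loop.
import onnxruntime as ort

session = ort.InferenceSession(
    "mistral-7b-instruct-v0.2.onnx",
    providers=["CPUExecutionProvider"],  # swap in other providers if available
)

for inp in session.get_inputs():
    print("input :", inp.name, inp.shape, inp.type)
for out in session.get_outputs():
    print("output:", out.name, out.shape, out.type)
```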
## 📋 Credits
- Base model: Mistral AI
- Quantization: llama.cpp
- ONNX export: Optimum, ONNX Runtime
**Maintainer:** Makatia