Qwen2.5-7B-Instruct GGUF Model

This is a GGUF-format version of the Qwen2.5-7B-Instruct model, converted from the original Hugging Face checkpoint. The model is designed for instruction-following and chat applications.

Model Details

  • Model Type: Instruction-tuned Large Language Model
  • Base Model: Qwen/Qwen2.5-7B-Instruct
  • Parameters: 7.62B
  • Architecture: qwen2
  • Format: GGUF (optimized for CPU inference)
  • Context Length: 8192 tokens
  • Languages: English and Chinese (bilingual)

Usage

This model can be used with llama.cpp for efficient CPU inference:

# Download the model
git lfs install
git clone https://huggingface.co/Second-Me/Qwen2.5-7B-Instruct-GGUF

# Run interactive inference (in newer llama.cpp builds the binary is named llama-cli rather than main)
./main -m Qwen2.5-7B-Instruct-GGUF/model.gguf -n 1024 --repeat_penalty 1.1 --color -i -r "User:" -f prompts/chat.txt

Prompt Format

The model follows this conversation format:

User: <user input>
Assistant: <model response>
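The format above can be assembled programmatically before passing the prompt to the model. Below is a minimal sketch; the helper name and turn structure are illustrative, not part of the model's tooling:

```python
# Sketch: build a prompt string in the "User:/Assistant:" format described above.
def build_prompt(turns):
    """turns: list of (user_message, assistant_reply) pairs; pass None as the
    final assistant reply to leave the prompt open for the model to complete."""
    lines = []
    for user, assistant in turns:
        lines.append(f"User: {user}")
        lines.append(f"Assistant: {assistant}" if assistant is not None else "Assistant:")
    return "\n".join(lines)

prompt = build_prompt([("Hello, who are you?", None)])
print(prompt)
# User: Hello, who are you?
# Assistant:
```

When using llama.cpp interactively, passing `-r "User:"` (as in the command above) stops generation when the model begins the next user turn, which pairs naturally with this prompt layout.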

License

This model inherits the Apache 2.0 license from the original Qwen2.5-7B-Instruct model.

Acknowledgments

  • Original model by Qwen team at Alibaba Cloud
  • GGUF conversion by Second-Me team