# Qwen3-4B-Instruct-2507
## Model Description
**Qwen3-4B-Instruct-2507** is an updated non-thinking variant in the Qwen3 family, designed for instruction-following tasks without generating `<think></think>` reasoning blocks.
It is trained for enhanced general capabilities, including logic, coding, math, science, and long-tail multilingual knowledge, and natively supports a 256K-token context window.
## Features
- **Instruction-tuned performance**: Strong instruction following, logical reasoning, reading comprehension, and coding.
- **Multilingual strength**: Expanded long-tail coverage across many languages.
- **Massive context window**: Handles up to 262,144 tokens natively.
- **Clean output**: No thinking-mode parsing needed—just straight responses.
## Use Cases
- High-quality conversational agents and instruction following
- Processing long documents, books, legal texts, and source code
- Multilingual tasks or low-resource language scenarios
## Inputs and Outputs
**Input**: Text prompts—questions, commands, code tasks—without any special thinking mode flags.
**Output**: Direct, context-aware responses—answers, explanations, code—with no internal thought annotations.
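To make the input/output contract concrete, the sketch below shows one way to query the upstream Hugging Face checkpoint listed under References with the `transformers` library. This is an illustrative assumption, not the deployment path for this repo's NPU build, which is served through Nexa-SDK as described under "How to use"; the example prompt text is hypothetical.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the upstream checkpoint from the References section, not the NPU build in this repo.
model_name = "Qwen/Qwen3-4B-Instruct-2507"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

# A plain instruction prompt: no thinking-mode flags are needed.
messages = [{"role": "user", "content": "Summarize the advantages of a 256K context window in two sentences."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)

# The response is just the newly generated tokens, with no <think></think> block to strip.
response = tokenizer.decode(output_ids[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(response)
```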
---
## How to use
> ⚠️ **Hardware requirement:** the model currently runs **only on Qualcomm NPUs** (e.g., a Snapdragon-powered AI PC).
> Apple NPU support is planned next.
### 1) Install Nexa-SDK
- Download the SDK and follow the steps under the "Deploy" section on Nexa's model page: [Download Windows arm64 SDK](https://sdk.nexa.ai/model/Qwen3-4B-Instruct-2507)
- (Other platforms coming soon)
### 2) Get an access token
Create a token in the Model Hub, then log in:
```bash
nexa config set license '<access_token>'
```
### 3) Run the model
From a terminal, run:
```bash
nexa infer NexaAI/Qwen3-4B-Instruct-2507-npu
```
---
## License
- Licensed under **Apache-2.0**
## References
- Model card: [https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507)