# Qwen3-4B-Instruct-2507
## Model Description
**Qwen3-4B-Instruct-2507** is an updated non-thinking variant in the Qwen3 family, designed for instruction-following tasks without generating `<think></think>` reasoning blocks.
It is trained for enhanced general capabilities, including logic, coding, math, science, and long-tail multilingual knowledge, and natively supports a 256K-token context window.
## Features
- **Instruction-tuned performance**: Strong instruction following, logical reasoning, reading comprehension, and coding.
- **Multilingual strength**: Expanded long-tail coverage across many languages.
- **Massive context window**: Handles up to 262,144 tokens natively.
- **Clean output**: No thinking-mode parsing needed—just straight responses.
## Use Cases
- High-quality conversational agents and instruction following
- Processing long documents, books, legal texts, and source code
- Multilingual tasks or low-resource language scenarios
## Inputs and Outputs
**Input**: Text prompts—questions, commands, code tasks—without any special thinking mode flags.
**Output**: Direct, context-aware responses—answers, explanations, code—with no internal thought annotations.
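To make the input/output contract concrete, the sketch below shows one way to query the upstream Hugging Face checkpoint listed under References with the `transformers` library. This is an illustrative assumption, not the deployment path for this repo's NPU build, which is served through Nexa-SDK as described under "How to use"; the example prompt text is hypothetical.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the upstream checkpoint from the References section, not the NPU build in this repo.
model_name = "Qwen/Qwen3-4B-Instruct-2507"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

# A plain instruction prompt: no thinking-mode flags are needed.
messages = [{"role": "user", "content": "Summarize the advantages of a 256K context window in two sentences."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)

# The response is just the newly generated tokens, with no <think></think> block to strip.
response = tokenizer.decode(output_ids[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(response)
```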
---
## How to use
> ⚠️ **Hardware requirement:** the model currently runs **only on Qualcomm NPUs** (e.g., a Snapdragon-powered AI PC).
> Apple NPU support is planned next.
### 1) Install Nexa-SDK
- Download the SDK and follow the steps under the "Deploy" section on Nexa's model page: [Download Windows arm64 SDK](https://sdk.nexa.ai/model/Qwen3-4B-Instruct-2507)
- (Other platforms coming soon)
### 2) Get an access token
Create a token in the Model Hub, then log in:
```bash
nexa config set license '<access_token>'
```
### 3) Run the model
From a terminal, run:
```bash
nexa infer NexaAI/Qwen3-4B-Instruct-2507-npu
```
---
## License
- Licensed under **Apache-2.0**
## References
- Model card: [https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507)