# Phi-4-mini
Run Phi-4-mini optimized for Qualcomm NPUs with nexaSDK.
## Quickstart
1. Install nexaSDK and create a free account at sdk.nexa.ai.
2. Activate your device with your access token:

   ```bash
   nexa config set license '<access_token>'
   ```

3. Run the model on the Qualcomm NPU in one line:

   ```bash
   nexa infer NexaAI/phi4-mini-npu-turbo
   ```
## Model Description
Phi-4-mini is a ~3.8B-parameter instruction-tuned model from Microsoft’s Phi-4 family. Trained on a blend of synthetic “textbook-style” data, filtered public web content, curated books/Q&A, and high-quality supervised chat data, it emphasizes reasoning-dense capabilities while maintaining a compact footprint. This NPU Turbo build uses Nexa’s Qualcomm backend (QNN/Hexagon) to deliver lower latency and higher throughput on-device, with support for 128K context and efficient long-context memory handling.
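To see why long-context memory handling matters at 128K, here is a rough back-of-envelope KV-cache sizing sketch. The layer and head counts below are illustrative assumptions for a model of this class, not published Nexa or Microsoft figures:

```python
# Back-of-envelope KV-cache sizing for one sequence at full 128K context.
# The architecture figures are illustrative assumptions, not official specs.

num_layers   = 32       # assumed transformer depth
num_kv_heads = 8        # assumed grouped-query-attention KV heads
head_dim     = 128      # assumed per-head dimension
context_len  = 128_000  # the advertised 128K context window

def kv_cache_bytes(bytes_per_value: int) -> int:
    """Total size of the K and V caches across all layers."""
    return 2 * num_layers * num_kv_heads * head_dim * context_len * bytes_per_value

for label, width in [("fp16", 2), ("int8", 1)]:
    print(f"{label}: {kv_cache_bytes(width) / 2**30:.1f} GiB")
```

Under these assumptions the cache runs to double-digit GiB at fp16, which is why cache quantization and residency strategies are central to on-device long-context inference.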
## Features
- Lightweight yet capable: strong reasoning (math/logic) in a compact 3.8B model.
- Instruction-following: enhanced SFT + DPO alignment for reliable chat.
- Content generation: drafting, completion, summarization, code comments, and more.
- Conversational AI: context-aware assistants/agents with long-context support (128K).
- NPU-Turbo path: INT8/INT4 quantization, op fusion, and KV-cache residency for Snapdragon® NPUs via nexaSDK (see the quantization sketch after this list).
- Customizable: fine-tune/adapt for domain-specific or enterprise use.
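As a concrete illustration of what the INT8 path above does to weights, here is a minimal symmetric per-tensor quantization sketch in plain NumPy. It shows the general technique only, not Nexa's actual toolchain:

```python
import numpy as np

def quantize_int8(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Map float weights onto the int8 grid with a single shared scale."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
print("max abs error:", np.abs(w - dequantize(q, scale)).max())
```

Halving or quartering weight width this way is what lets the model fit NPU memory budgets and feed the accelerator's integer math units.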
## Use Cases
- Personal & enterprise chatbots
- On-device/offline assistants (latency-bound scenarios)
- Document/report/email summarization
- Education, tutoring, and STEM reasoning tools
- Vertical applications (e.g., healthcare, finance, legal) with appropriate safeguards
## Inputs and Outputs
Input:
- Text prompts or conversation history (chat-format, tokenized sequences).
Output:
- Generated text: responses, explanations, or creative content.
- Optionally: raw logits/probabilities for advanced downstream tasks.
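For concreteness, here is a minimal sketch of both sides: a chat-formatted conversation using the common role/content convention (the exact schema nexaSDK expects may differ), and a softmax that turns raw logits into probabilities for downstream use:

```python
import numpy as np

# A typical chat-format input: a list of role/content turns. This is the
# common convention for instruction-tuned models, not a documented
# nexaSDK payload.
conversation = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Summarize why NPUs help on-device inference."},
]

def softmax(logits: np.ndarray) -> np.ndarray:
    """Convert raw logits into a probability distribution."""
    z = logits - logits.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])  # toy logits over a 3-token vocabulary
print(softmax(logits))              # probabilities summing to 1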
## License
- Licensed under: MIT License
## References
- 📰 Phi-4-mini Microsoft Blog
- 📖 Phi-4-mini Technical Report
- 👩‍🍳 Phi Cookbook
- 🚀 Model paper