Phi-4-mini

Run Phi-4-mini optimized for Qualcomm NPUs with nexaSDK.

Quickstart

  1. Install nexaSDK and create a free account at sdk.nexa.ai

  2. Activate your device with your access token:

    nexa config set license '<access_token>'
    
  3. Run the model on Qualcomm NPU in one line:

    nexa infer NexaAI/phi4-mini-npu-turbo
    

Model Description

Phi-4-mini is a ~3.8B-parameter instruction-tuned model from Microsoft’s Phi-4 family. Trained on a blend of synthetic “textbook-style” data, filtered public web content, curated books/Q&A, and high-quality supervised chat data, it emphasizes reasoning-dense capabilities while maintaining a compact footprint. This NPU Turbo build uses Nexa’s Qualcomm backend (QNN/Hexagon) to deliver lower latency and higher throughput on-device, with support for 128K context and efficient long-context memory handling.
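To see why "efficient long-context memory handling" matters at 128K tokens, here is a back-of-envelope KV-cache size estimate. The layer/head counts below are assumptions taken from the publicly released Phi-4-mini configuration (32 layers, 8 KV heads with grouped-query attention, head dimension 128), not from this card; check the technical report for exact numbers.

```python
# Assumed Phi-4-mini architecture values (verify against the official config):
LAYERS, KV_HEADS, HEAD_DIM = 32, 8, 128
BYTES_FP16 = 2

def kv_cache_bytes(seq_len, bytes_per_value=BYTES_FP16):
    """Total KV-cache size in bytes for a given sequence length.

    The factor of 2 accounts for storing both the K and V tensors
    at every layer.
    """
    return 2 * LAYERS * KV_HEADS * HEAD_DIM * seq_len * bytes_per_value

full_context = kv_cache_bytes(128 * 1024)      # 128K-token context
print(full_context / 2**30, "GiB at fp16")     # 16.0 GiB at fp16
print(kv_cache_bytes(128 * 1024, 1) / 2**30, "GiB at int8")  # 8.0 GiB
```

Under these assumptions a full 128K-token cache is ~16 GiB at fp16, which is why quantized KV storage and cache residency strategies are central to running long contexts on-device.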

Features

  • Lightweight yet capable: strong reasoning (math/logic) in a compact 3.8B model.
  • Instruction-following: enhanced SFT + DPO alignment for reliable chat.
  • Content generation: drafting, completion, summarization, code comments, and more.
  • Conversational AI: context-aware assistants/agents with long-context support (128K).
  • NPU-Turbo path: INT8/INT4 quantization, op fusion, and KV-cache residency for Snapdragon® NPUs via nexaSDK.
  • Customizable: fine-tune/adapt for domain-specific or enterprise use.
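The INT8 path mentioned above can be illustrated with a minimal symmetric per-tensor quantization sketch. This is a generic illustration of the technique, not Nexa's actual quantizer; the function names are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor INT8 quantization (illustrative sketch).

    Maps floats into [-127, 127] using a single scale derived from
    the largest absolute value in the tensor.
    """
    amax = max(abs(v) for v in values) or 1.0  # avoid div-by-zero on all-zero input
    scale = amax / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from INT8 codes."""
    return [v * scale for v in q]

weights = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(weights)
recovered = dequantize(codes, scale)
```

Each value is recovered to within half a quantization step, which is the precision/footprint trade-off that makes 8-bit (and, more aggressively, 4-bit) weights attractive on NPUs.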

Use Cases

  • Personal & enterprise chatbots
  • On-device/offline assistants (latency-bound scenarios)
  • Document/report/email summarization
  • Education, tutoring, and STEM reasoning tools
  • Vertical applications (e.g., healthcare, finance, legal) with appropriate safeguards

Inputs and Outputs

Input:

  • Text prompts or conversation history (chat-format, tokenized sequences).

Output:

  • Generated text: responses, explanations, or creative content.
  • Optionally: raw logits/probabilities for advanced downstream tasks.
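For the chat-format input above, the upstream Phi-4-mini-instruct model card documents a role-tag template of the form `<|role|>...<|end|>`. The helper below sketches that format; treat the exact template as an assumption and prefer the tokenizer's own chat template when available.

```python
def build_phi4_mini_prompt(messages):
    """Render a list of chat messages into the Phi-4-mini role-tag format.

    Assumed template (verify against the model's tokenizer config):
    <|system|>...<|end|><|user|>...<|end|><|assistant|>
    """
    parts = [f"<|{m['role']}|>{m['content']}<|end|>" for m in messages]
    parts.append("<|assistant|>")  # cue the model to generate the reply
    return "".join(parts)

prompt = build_phi4_mini_prompt([
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "Hi"},
])
# -> "<|system|>You are helpful.<|end|><|user|>Hi<|end|><|assistant|>"
```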

License

References

📰 Phi-4-mini Microsoft Blog
📖 Phi-4-mini Technical Report
👩‍🍳 Phi Cookbook
🚀 Model paper
