Llama-3.2-1B

Run Llama-3.2-1B optimized for Intel NPUs with nexaSDK.

Quickstart

  1. Install nexaSDK and create a free account at sdk.nexa.ai

  2. Activate your device with your access token:

    nexa config set license '<access_token>'
    
  3. Run the model on NPU in one line:

    nexa infer NexaAI/llama3.2-1B-intel-npu
    
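If you want to drive the CLI from a script rather than a terminal, the `nexa infer` step can be wrapped in a small subprocess helper. This is a minimal sketch, not part of nexaSDK itself; it assumes only that the `nexa` binary from step 1 is on your PATH:

```python
import subprocess

def run_nexa_infer(model_id: str, dry_run: bool = False) -> list[str]:
    """Build (and optionally execute) the `nexa infer` command for a model.

    With dry_run=True the command is returned without being executed,
    which is useful for logging or testing.
    """
    cmd = ["nexa", "infer", model_id]
    if not dry_run:
        # Raises CalledProcessError if the CLI exits non-zero.
        subprocess.run(cmd, check=True)
    return cmd

# Inspect the command without executing it:
print(run_nexa_infer("NexaAI/llama3.2-1B-intel-npu", dry_run=True))
```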

Model Description

Llama-3.2-1B is the smallest model in the Llama 3.2 family, optimized for efficiency and ultra-lightweight deployment.
With just 1B parameters, it enables fast inference on resource-constrained environments while retaining strong instruction-following and multilingual capabilities for its size.

Features

  • Ultra-compact design: 1B parameters for minimal memory and compute requirements.
  • Instruction-tuned: Capable of following prompts and answering questions reliably.
  • Multilingual support: Handles a wide set of languages despite its small scale.
  • Edge-ready: Runs efficiently on laptops, mobile devices, and other constrained hardware.

Use Cases

  • On-device conversational agents and personal assistants.
  • Educational apps or lightweight tutoring systems.
  • Prototyping with LLMs in environments where compute or cost is heavily constrained.
  • Offline or embedded applications where larger models are impractical.

Inputs and Outputs

Input: Text prompts such as questions, instructions, or code snippets.
Output: Concise natural language responses, answers, or explanations.

License

  • Licensed under the Meta Llama 3.2 Community License.
