Llama-3.2-1B

Run Llama-3.2-1B optimized for Intel NPUs with nexaSDK.

Quickstart

  1. Install nexaSDK and create a free account at sdk.nexa.ai

  2. Activate your device with your access token:

    nexa config set license '<access_token>'
    
  3. Run the model on NPU in one line:

    nexa infer NexaAI/llama3.2-1B-intel-npu
    
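If you want to drive the CLI from a script rather than a terminal, the `nexa infer` step can be wrapped in a small subprocess helper. This is a minimal sketch, not part of nexaSDK itself; it assumes only that the `nexa` binary from step 1 is on your PATH:

```python
import subprocess

def run_nexa_infer(model_id: str, dry_run: bool = False) -> list[str]:
    """Build (and optionally execute) the `nexa infer` command for a model.

    With dry_run=True the command is returned without being executed,
    which is useful for logging or testing.
    """
    cmd = ["nexa", "infer", model_id]
    if not dry_run:
        # Raises CalledProcessError if the CLI exits non-zero.
        subprocess.run(cmd, check=True)
    return cmd

# Inspect the command without executing it:
print(run_nexa_infer("NexaAI/llama3.2-1B-intel-npu", dry_run=True))
```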

Model Description

Llama-3.2-1B is the smallest model in the Llama 3.2 family, optimized for efficiency and ultra-lightweight deployment.
With just 1B parameters, it enables fast inference on resource-constrained environments while retaining strong instruction-following and multilingual capabilities for its size.

Features

  • Ultra-compact design: 1B parameters for minimal memory and compute requirements.
  • Instruction-tuned: Capable of following prompts and answering questions reliably.
  • Multilingual support: Handles a wide set of languages despite its small scale.
  • Edge-ready: Runs efficiently on laptops, mobile devices, and other constrained hardware.

Use Cases

  • On-device conversational agents and personal assistants.
  • Educational apps or lightweight tutoring systems.
  • Prototyping with LLMs in environments where compute or cost is heavily constrained.
  • Offline or embedded applications where larger models are impractical.

Inputs and Outputs

Input: Text prompts such as questions, instructions, or code snippets.
Output: Concise natural language responses, answers, or explanations.

License

  • Licensed under the Meta Llama 3.2 Community License.
