Model Information

This model is derived from Meta’s Llama-3.1-8B-Instruct and has been converted and optimized to run efficiently on Qualcomm Cloud AI 100 hardware. Using Qualcomm's developer-centric toolchain, it incorporates reengineered Transformer components and precision-optimized graph transformations for better performance on the target accelerator.
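
The snippet below is a hedged usage sketch based on Qualcomm's public Efficient Transformers (QEfficient) toolchain; the class name, compile parameters, and generate signature shown here are assumptions and may differ between library releases.

```python
# Hedged usage sketch: QEFFAutoModelForCausalLM, the compile() parameters,
# and the generate() signature are assumptions based on Qualcomm's public
# Efficient Transformers (QEfficient) library and may differ by release.
from transformers import AutoTokenizer
from QEfficient import QEFFAutoModelForCausalLM

model_id = "Hyratek/Llama-3.1-8B-Instruct-QAIC"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Load the optimized checkpoint and compile a binary for Cloud AI 100.
model = QEFFAutoModelForCausalLM.from_pretrained(model_id)
model.compile(num_cores=16, prefill_seq_len=128, ctx_len=2048)

# Run generation on the compiled model.
model.generate(
    tokenizer=tokenizer,
    prompts=["Summarize the benefits of running LLMs on dedicated AI accelerators."],
)
```

Here compile() is assumed to produce a device binary with a fixed prefill length and context window; the exact flag names depend on the toolchain version you have installed.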

Key Features

  • Optimized LLM Blocks: Includes custom modules to handle intermediate states and precision challenges, ensuring high-performance inference.
  • Transformation Tools: Supports graph modifications to retain model accuracy while improving efficiency through mathematical optimizations.
  • Export Ready: Compatible with ONNX for easy deployment.
  • Comprehensive Testing: Each PR undergoes extensive validation, comparing outputs against the original model via mean squared error (MSE); a hedged sketch of such a check follows this list.
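
Below is a minimal validation sketch in the spirit of the MSE check described above. It assumes the transformed model has been exported to a file named model.onnx that accepts plain input_ids and attention_mask inputs and returns logits as its first output; the real exported graph may use different input names and additional KV-cache tensors.

```python
# Hypothetical validation sketch: "model.onnx" and its input/output names
# are assumptions, not the exact artifacts produced by the toolchain.
import numpy as np
import torch
import onnxruntime as ort
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
reference = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)
reference.eval()

prompt = "Explain what a KV cache is in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")

# Reference logits from the original FP32 model.
with torch.no_grad():
    ref_logits = reference(**inputs).logits.numpy()

# Logits from the exported / transformed ONNX graph (file and input names assumed).
session = ort.InferenceSession("model.onnx")
onnx_logits = session.run(
    None,
    {
        "input_ids": inputs["input_ids"].numpy(),
        "attention_mask": inputs["attention_mask"].numpy(),
    },
)[0]

# Mean squared error between the two logit tensors, as in the PR checks.
mse = np.mean((ref_logits - onnx_logits) ** 2)
print(f"MSE vs. original model: {mse:.6e}")
```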