This model is an ExecuTorch-compatible, quantized variant of Meta's Llama 3.2 3B Instruct, using the official SpinQuant INT4/EO8 scheme released by Meta for efficient on-device inference.

- Based on: Meta Llama 3.2 (3B Instruct)
- Quantization method: SpinQuant with 4-bit weight quantization (Meta's INT4/EO8 configuration)

For more information, refer to the [Llama-3.2-3B-Instruct-SpinQuant_INT4_EO8 model card](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct-SpinQuant_INT4_EO8).
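
As a minimal sketch of how the exported artifact might be fetched for ExecuTorch inference: the snippet below downloads the `.pte` program and tokenizer from the Hub with `huggingface_hub`. The repo id and filenames are assumptions, not taken from this card; substitute the actual names listed under this repository's files.

```python
# Sketch: fetch the exported ExecuTorch program (.pte) and tokenizer from the Hub.
# NOTE: repo_id and filenames below are assumptions -- replace them with the
# actual entries listed under this repository's "Files and versions" tab.
from huggingface_hub import hf_hub_download

repo_id = "meta-llama/Llama-3.2-3B-Instruct-SpinQuant_INT4_EO8"  # assumed; use this repo's actual id

pte_path = hf_hub_download(
    repo_id=repo_id,
    filename="Llama-3.2-3B-Instruct-SpinQuant_INT4_EO8.pte",  # assumed filename
)
tokenizer_path = hf_hub_download(
    repo_id=repo_id,
    filename="tokenizer.model",  # assumed filename
)

print("ExecuTorch program:", pte_path)
print("Tokenizer:", tokenizer_path)
# The .pte program and tokenizer can then be passed to the ExecuTorch Llama runner
# (see examples/models/llama in the ExecuTorch repository) for on-device generation.
```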