---
license: llama3.1
library_name: transformers
pipeline_tag: text-generation
tags:
- int4
- vllm
- llmcompressor
base_model:
- meta-llama/Llama-3.1-8B-Instruct
---
# Llama-3.1-8B-Instruct-MR-GPTQ-mxfp
## Model Overview
This model was obtained by quantizing the weights of [Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) to the MXFP4 data type. This optimization reduces the number of bits per parameter from 16 to 4.25, cutting disk size and GPU memory requirements by approximately 73%.
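For reference, a rough sketch of where the 4.25 bits-per-parameter figure comes from, assuming the standard OCP Microscaling MXFP4 layout of 4-bit elements sharing one 8-bit scale per block of 32 values:

```python
# MXFP4 (OCP Microscaling): each weight is a 4-bit FP4 value, and every block of
# 32 weights shares one 8-bit exponent scale (block size of 32 is an assumption
# based on the MX specification, not stated explicitly in this card).
element_bits = 4
scale_bits = 8
block_size = 32

bits_per_param = element_bits + scale_bits / block_size   # 4.25 bits per parameter
reduction = 1 - bits_per_param / 16                        # relative to 16-bit BF16/FP16 weights

print(f"{bits_per_param} bits/param, ~{reduction:.0%} smaller")  # 4.25 bits/param, ~73% smaller
```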
## Usage
*MR-GPTQ* quantized models with [QuTLASS](https://github.com/IST-DASLab/qutlass) kernels are supported in the following integrations:
- `transformers` with these features:
- Available in `main` ([Documentation](https://huggingface.co/docs/transformers/main/en/quantization/fp_quant#fp-quant)).
- RTN on-the-fly quantization.
- Pseudo-quantization QAT.
- `vLLM` with these features:
- Available in [this PR](https://github.com/vllm-project/vllm/pull/24440).
  - Compatible with real quantization models from `FP-Quant` and the `transformers` integration (see the example below).
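A minimal generation sketch with `vLLM`, assuming a build that includes the PR linked above and that the checkpoint id is `ISTA-DASLab/Llama-3.1-8B-Instruct-MR-GPTQ-mxfp` (taken from the evaluation table below):

```python
from vllm import LLM, SamplingParams

# Assumes vLLM was built with the MR-GPTQ/QuTLASS support from the PR linked above.
model_id = "ISTA-DASLab/Llama-3.1-8B-Instruct-MR-GPTQ-mxfp"

llm = LLM(model=model_id)
params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)

prompts = ["Briefly explain what MXFP4 quantization does to a model's weights."]
for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```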
## Evaluation
This model was evaluated on a subset of OpenLLM v1 benchmarks and Platinum bench. Model outputs were generated with the `vLLM` engine.
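For context, a hedged sketch of how such an evaluation could be reproduced with the lm-evaluation-harness vLLM backend; the repository id, task selection, and few-shot settings shown here are assumptions, not the exact configuration used to produce the numbers below:

```python
# Hypothetical reproduction sketch using lm-evaluation-harness with the vLLM
# backend; the exact harness, task variants, and settings behind the reported
# results are not specified in this card.
import lm_eval

results = lm_eval.simple_evaluate(
    model="vllm",
    model_args="pretrained=ISTA-DASLab/Llama-3.1-8B-Instruct-MR-GPTQ-mxfp",
    tasks=["gsm8k", "hellaswag", "winogrande"],
)
for task, metrics in results["results"].items():
    print(task, metrics)
```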
*OpenLLM v1 results*
| Model | MMLU-CoT | GSM8k | Hellaswag | Winogrande | **Average** | **Recovery (%)** |
|--------------------------------------------------|--------:|------:|----------:|-----------:|------------:|-----------------:|
| `meta-llama/Llama-3.1-8B-Instruct` | 0.7276 | 0.8506 | 0.8001 | 0.7790 | 0.7893 | – |
| `ISTA-DASLab/Llama-3.1-8B-Instruct-MR-GPTQ-mxfp` | 0.6754 | 0.7892 | 0.7737 | 0.7324 | 0.7427 | 94.09 |
*Platinum bench results*
Below we report recoveries on individual tasks as well as the average recovery.
**Recovery by Task**
| Task | Recovery (%) |
|------|--------------|
| SingleOp | 97.94 |
| SingleQ | 95.95 |
| MultiArith | 98.22 |
| SVAMP | 95.08 |
| GSM8K | 93.69 |
| MMLU-Math | 80.54 |
| BBH-LogicalDeduction-3Obj | 89.87 |
| BBH-ObjectCounting | 82.03 |
| BBH-Navigate | 90.66 |
| TabFact | 86.92 |
| HotpotQA | 96.81 |
| SQuAD | 98.46 |
| DROP | 94.33 |
| Winograd-WSC | 89.47 |
| Average | **92.14** |