Update README.md
README.md CHANGED

@@ -17,6 +17,9 @@ base_model:
 pipeline_tag: text-generation
 ---
 
+[Phi4-mini](https://huggingface.co/microsoft/Phi-4-mini-instruct) is quantized by the PyTorch team with [torchao](https://huggingface.co/docs/transformers/main/en/quantization/torchao), using 8-bit embeddings and 8-bit dynamic activation with int4 weights (8da4w).
+You can export the quantized model to an [ExecuTorch](https://github.com/pytorch/executorch) pte file, or use the [quantized pte](https://huggingface.co/pytorch/Phi-4-mini-instruct-8da4w/blob/main/phi4-mini-8da4w.pte) file directly to run on a mobile device; see [Running in a mobile app](#running-in-a-mobile-app).
+
 # Quantization Recipe
 
 First, install the required packages:
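
As a rough illustration of the quantization the added text describes (not the exact recipe, which the README's Quantization Recipe section goes on to cover), the sketch below applies torchao's 8da4w quantization to the base model. The config class name, the group size, and the omission of the 8-bit embedding step are assumptions about the torchao version in use.

```python
# Hedged sketch: 8da4w quantization of Phi-4-mini with torchao.
# Assumes a recent torchao that exposes quantize_ and
# Int8DynamicActivationInt4WeightConfig; group_size=32 is illustrative.
# The 8-bit embedding quantization mentioned above is omitted here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from torchao.quantization import quantize_, Int8DynamicActivationInt4WeightConfig

model_id = "microsoft/Phi-4-mini-instruct"
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Quantize linear layers: int8 dynamic activations + int4 weights (8da4w).
quantize_(model, Int8DynamicActivationInt4WeightConfig(group_size=32))

# Quick smoke test of the quantized model.
prompt = "Explain quantization in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

The export to an ExecuTorch pte file is a separate step; the linked [ExecuTorch](https://github.com/pytorch/executorch) project documents that flow, and the README's later sections describe how the provided pte file was produced.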