Update README.md
README.md
CHANGED
@@ -6,4 +6,18 @@ pipeline_tag: translation
 tags:
 - text2text-generation
 - text-generation-inference
----
+---
+
+## ONNX version of [google/madlad400-3b-mt](https://huggingface.co/google/madlad400-3b-mt)
+
+## Converted and quantized with [optimum-cli](https://github.com/huggingface/optimum)
+
+- Convert to ONNX:
+```sh
+optimum-cli onnxruntime export --model google/madlad400-3b-mt <output_path> --legacy
+```
+
+- Quantization:
+```sh
+optimum-cli onnxruntime quantize --onnx_model <input_model_path> -o <output_model_path> --avx512_vnni
+```
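The two commands added to the README can be chained so that the export output feeds the quantization step. The following is a minimal sketch, not part of the commit: the directory names and the `DRY_RUN` switch are hypothetical, and the command lines are copied as written in the README. Subcommand spelling may differ between optimum versions (the optimum documentation I am aware of spells the export step `optimum-cli export onnx`), so it is worth checking `optimum-cli --help` before running it for real.

```sh
#!/bin/sh
# Sketch only: chains the README's export and quantize steps.
# By default it just prints the commands; set DRY_RUN=0 to execute them.
set -eu

MODEL_ID="google/madlad400-3b-mt"
# Hypothetical directory names, standing in for the README's placeholders.
ONNX_DIR="${ONNX_DIR:-madlad400-3b-mt-onnx}"
QUANT_DIR="${QUANT_DIR:-madlad400-3b-mt-onnx-quantized}"

# Command lines as written in the README, with placeholders filled in.
EXPORT_CMD="optimum-cli onnxruntime export --model $MODEL_ID $ONNX_DIR --legacy"
QUANT_CMD="optimum-cli onnxruntime quantize --onnx_model $ONNX_DIR -o $QUANT_DIR --avx512_vnni"

run() {
    # Echo the command, then run it only when DRY_RUN=0.
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Unquoted on purpose so the string splits into arguments;
# this assumes the paths contain no spaces.
run $EXPORT_CMD
run $QUANT_CMD
```

Run as-is, the script prints both commands with the paths substituted, which makes it easy to confirm that the quantization input is the export output before committing to a long conversion.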