Add/update the quantized ONNX model files and README.md for Transformers.js v3
## Applied Quantizations
### ✅ Based on `decoder_model.onnx` *with* slimming
↳ ✅ `fp16`: `decoder_model_fp16.onnx` (added)
↳ ✅ `int8`: `decoder_model_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_model_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_model_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_model_bnb4.onnx` (added)
### ✅ Based on `decoder_with_past_model.onnx` *with* slimming
↳ ✅ `fp16`: `decoder_with_past_model_fp16.onnx` (added)
↳ ✅ `int8`: `decoder_with_past_model_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_with_past_model_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_with_past_model_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_with_past_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_with_past_model_bnb4.onnx` (added)
### ✅ Based on `decoder_model_merged.onnx` *with* slimming
**The base model `decoder_model_merged.onnx` has been renamed to `model.onnx`.**
↳ ✅ `fp16`: `model_fp16.onnx` (added)
↳ ✅ `int8`: `model_int8.onnx` (added)
↳ ✅ `uint8`: `model_uint8.onnx` (added)
↳ ✅ `q4`: `model_q4.onnx` (added)
↳ ✅ `q4f16`: `model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `model_bnb4.onnx` (added)
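
The file names listed above follow a regular pattern: each quantized variant is stored under `onnx/` as the base file stem plus a `_<dtype>` suffix. As a rough sketch of that naming scheme (the helper `quantizedFileName` and the `DTYPES` constant are hypothetical names used for illustration, not part of Transformers.js):

```javascript
// Quantization suffixes added in this commit (taken from the list above).
const DTYPES = ['fp16', 'int8', 'uint8', 'q4', 'q4f16', 'bnb4'];

// Hypothetical helper: builds the repo path of a quantized variant.
// `base` is an ONNX file stem such as 'model' or 'decoder_model'.
function quantizedFileName(base, dtype) {
  if (!DTYPES.includes(dtype)) {
    throw new Error(`Unknown dtype: ${dtype}`);
  }
  return `onnx/${base}_${dtype}.onnx`;
}

console.log(quantizedFileName('model', 'q4')); // onnx/model_q4.onnx
```

In Transformers.js v3 a variant is normally chosen by passing a `dtype` option to `pipeline` (for example `{ dtype: 'q4' }`) rather than by building paths by hand; the helper only makes the naming convention explicit.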
- .gitattributes +6 -0
- README.md +16 -0
- onnx/decoder_model_bnb4.onnx +3 -0
- onnx/decoder_model_bnb4.onnx_data +3 -0
- onnx/decoder_model_fp16.onnx +3 -0
- onnx/decoder_model_int8.onnx +3 -0
- onnx/decoder_model_q4.onnx +3 -0
- onnx/decoder_model_q4.onnx_data +3 -0
- onnx/decoder_model_q4f16.onnx +3 -0
- onnx/decoder_model_uint8.onnx +3 -0
- onnx/decoder_with_past_model_bnb4.onnx +3 -0
- onnx/decoder_with_past_model_bnb4.onnx_data +3 -0
- onnx/decoder_with_past_model_fp16.onnx +3 -0
- onnx/decoder_with_past_model_int8.onnx +3 -0
- onnx/decoder_with_past_model_q4.onnx +3 -0
- onnx/decoder_with_past_model_q4.onnx_data +3 -0
- onnx/decoder_with_past_model_q4f16.onnx +3 -0
- onnx/decoder_with_past_model_uint8.onnx +3 -0
- onnx/model_bnb4.onnx +3 -0
- onnx/model_bnb4.onnx_data +3 -0
- onnx/model_fp16.onnx +3 -0
- onnx/model_int8.onnx +3 -0
- onnx/model_q4.onnx +3 -0
- onnx/model_q4.onnx_data +3 -0
- onnx/model_q4f16.onnx +3 -0
- onnx/model_uint8.onnx +3 -0
@@ -38,3 +38,9 @@ Constant_171_attr__value filter=lfs diff=lfs merge=lfs -text
 onnx/decoder_model.onnx_data filter=lfs diff=lfs merge=lfs -text
 onnx/decoder_model_merged.onnx_data filter=lfs diff=lfs merge=lfs -text
 onnx/decoder_with_past_model.onnx_data filter=lfs diff=lfs merge=lfs -text
+onnx/decoder_model_bnb4.onnx_data filter=lfs diff=lfs merge=lfs -text
+onnx/decoder_model_q4.onnx_data filter=lfs diff=lfs merge=lfs -text
+onnx/decoder_with_past_model_bnb4.onnx_data filter=lfs diff=lfs merge=lfs -text
+onnx/decoder_with_past_model_q4.onnx_data filter=lfs diff=lfs merge=lfs -text
+onnx/model_bnb4.onnx_data filter=lfs diff=lfs merge=lfs -text
+onnx/model_q4.onnx_data filter=lfs diff=lfs merge=lfs -text
@@ -5,4 +5,20 @@ library_name: transformers.js
 
 https://huggingface.co/MBZUAI/LaMini-GPT-774M with ONNX weights to be compatible with Transformers.js.
 
+## Usage (Transformers.js)
+
+If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
+```bash
+npm i @huggingface/transformers
+```
+
+**Example:** Text generation.
+
+```js
+import { pipeline } from '@huggingface/transformers';
+
+const generator = await pipeline('text-generation', 'Xenova/LaMini-GPT-774M');
+const output = await generator('Once upon a time, there was', { max_new_tokens: 10 });
+```
+
 Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:535fc0f0d2f7370a8bea4ac0e51794aa508a9458aaf26e499b1a68623efb3256
+size 1100511
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:357c7108942f572794ed8ceed78e8c9e3b42311393a57449a7e267d7214a0f21
+size 3097174016
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:714d59f4b9ae68dac2996854858f7733f5982873a9b1a42a1ea93451b0efb433
+size 1550181205
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:20423774c8d27fca20e10fc4ba1095de919b2968f73ea9e2269796e5fcd356a2
+size 1035491877
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0b6c4b9f55da6e508fb124799f7f0c095f5665fd747af3440fedba7b119ba482
+size 1099581
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:357c7108942f572794ed8ceed78e8c9e3b42311393a57449a7e267d7214a0f21
+size 3097174016
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:dffabc5cb3af28689b0dce6644ed6880abd53517d00347cb58981134763d81e7
+size 1550181224
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5cad481153d70cc683e032f6f36c8c49a963a22997dbaabe96d6bcb33ce96d32
+size 1035491952
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ad21e7543c9a550c42ff2f1cb6c1d1621c342740a7958b17782d840987c704e5
+size 1242020
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:357c7108942f572794ed8ceed78e8c9e3b42311393a57449a7e267d7214a0f21
+size 3097174016
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7eff39e656bb327f1b9c6f32a198b0e25a71eee4fb2a9cf71e8eb366ec11a45e
+size 1550331615
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:edffd2186bb0e25804b0cced6501ef33cd81821a0e087425d614b7c7268cf15d
+size 1035636711
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:39797a23f999f1d03a8e6fe22ca544d2ddf52d6c0d4323bc4e51dd9d9b45fb29
+size 1241127
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:357c7108942f572794ed8ceed78e8c9e3b42311393a57449a7e267d7214a0f21
+size 3097174016
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0e1ce61e8ffe35dbc944e41eeac075d3ee9fb7b0dfb638d077262c1c9f2157af
+size 1550331634
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:30cc339ba30ceaa7fd210f64a003b2e528e268804442c1025de4ad725717363f
+size 1035636786
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9da185201b2bcfa942474e12d16f4496ac0b415e9ca4c3e03815fc213e681b17
+size 2477414
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:357c7108942f572794ed8ceed78e8c9e3b42311393a57449a7e267d7214a0f21
+size 3097174016
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bd713a21a1c8873fd3162fb4d7ba34d4dc1983150a49cd2d5ed8729f1c75975a
+size 1551580635
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d32ecd492ec794676315227ce231e35e783ad93f4303b5d70172055f4eb7c591
+size 1037072109
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2d704a6f193a9120f5ce3a54939f4ea0b3163bee3aced662b5e885bab16f375e
+size 2476495
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:357c7108942f572794ed8ceed78e8c9e3b42311393a57449a7e267d7214a0f21
+size 3097174016
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0559c458653a7340a2dd2239c190c154a93c634b81956bad5101101ada5145d7
+size 1551580654
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:01c9f287c12fda7625cac0e3367c02b8f911c6ddf7af3721044d95fb1bdf5e86
+size 1037072184
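
Each three-line hunk above is a Git LFS pointer (a `version`, `oid`, and `size` line) rather than the model binary itself. A minimal sketch of reading one such pointer, assuming the simple `key value` line layout shown in the hunks (the function name `parseLfsPointer` is hypothetical, chosen for illustration):

```javascript
// Hypothetical helper: parses the text of a Git LFS pointer file into
// its version URL, hash algorithm, hash, and size in bytes.
function parseLfsPointer(text) {
  const fields = {};
  for (const line of text.trim().split('\n')) {
    const space = line.indexOf(' ');
    if (space === -1) continue; // skip lines without a "key value" shape
    fields[line.slice(0, space)] = line.slice(space + 1).trim();
  }
  // The oid line has the form "<algorithm>:<hex digest>".
  const [algorithm, hash] = (fields.oid ?? '').split(':');
  return {
    version: fields.version,
    algorithm,
    hash,
    size: Number(fields.size),
  };
}

// Pointer text copied from the first LFS hunk in this commit.
const pointer = parseLfsPointer(
  'version https://git-lfs.github.com/spec/v1\n' +
  'oid sha256:535fc0f0d2f7370a8bea4ac0e51794aa508a9458aaf26e499b1a68623efb3256\n' +
  'size 1100511\n'
);
console.log(pointer.algorithm, pointer.size); // sha256 1100511
```

Comparing the parsed `hash` values is one quick way to spot pointers that refer to identical content, such as the several hunks above that share the same `oid` and a size of 3097174016 bytes.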