whitphx (HF Staff) committed
Commit ae99577 · verified · 1 Parent(s): 185e1b6

Add/update the quantized ONNX model files and README.md for Transformers.js v3

## Applied Quantizations

### ✅ Based on `decoder_model.onnx` *with* slimming

↳ ✅ `fp16`: `decoder_model_fp16.onnx` (added)
↳ ✅ `int8`: `decoder_model_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_model_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_model_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_model_bnb4.onnx` (added)

### ✅ Based on `decoder_with_past_model.onnx` *with* slimming

↳ ✅ `fp16`: `decoder_with_past_model_fp16.onnx` (added)
↳ ✅ `int8`: `decoder_with_past_model_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_with_past_model_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_with_past_model_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_with_past_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_with_past_model_bnb4.onnx` (added)

### ✅ Based on `decoder_model_merged.onnx` *with* slimming

**The base model `decoder_model_merged.onnx` has been renamed to `model.onnx`.**

↳ ✅ `fp16`: `model_fp16.onnx` (added)
↳ ✅ `int8`: `model_int8.onnx` (added)
↳ ✅ `uint8`: `model_uint8.onnx` (added)
↳ ✅ `q4`: `model_q4.onnx` (added)
↳ ✅ `q4f16`: `model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `model_bnb4.onnx` (added)
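
The naming pattern in the lists above (base name, underscore, quantization label) can be sketched as a small lookup. This is only an illustration of the file layout in this commit: `quantizedFileName` and `QUANT_LABELS` are hypothetical names, not part of Transformers.js.

```js
// Quantization labels taken from the file names listed in this commit.
const QUANT_LABELS = ['fp16', 'int8', 'uint8', 'q4', 'q4f16', 'bnb4'];

// Hypothetical helper (not a Transformers.js API): build the expected
// path of a quantized file from its base model name and label.
function quantizedFileName(base, label) {
  if (!QUANT_LABELS.includes(label)) {
    throw new Error(`Unknown quantization label: ${label}`);
  }
  return `onnx/${base}_${label}.onnx`;
}

// The three base models covered by this commit
// (`model` is the renamed `decoder_model_merged`).
const bases = ['decoder_model', 'decoder_with_past_model', 'model'];
const expectedFiles = bases.flatMap((base) =>
  QUANT_LABELS.map((label) => quantizedFileName(base, label)),
);

console.log(expectedFiles.length); // 18
console.log(quantizedFileName('model', 'q4')); // onnx/model_q4.onnx
```

In Transformers.js v3 a variant is normally chosen through the `dtype` option when loading the model (for example `{ dtype: 'q4' }`), so a helper like this should only be needed for checking the repository contents.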

.gitattributes CHANGED
@@ -38,3 +38,9 @@ Constant_171_attr__value filter=lfs diff=lfs merge=lfs -text
  onnx/decoder_model.onnx_data filter=lfs diff=lfs merge=lfs -text
  onnx/decoder_model_merged.onnx_data filter=lfs diff=lfs merge=lfs -text
  onnx/decoder_with_past_model.onnx_data filter=lfs diff=lfs merge=lfs -text
+ onnx/decoder_model_bnb4.onnx_data filter=lfs diff=lfs merge=lfs -text
+ onnx/decoder_model_q4.onnx_data filter=lfs diff=lfs merge=lfs -text
+ onnx/decoder_with_past_model_bnb4.onnx_data filter=lfs diff=lfs merge=lfs -text
+ onnx/decoder_with_past_model_q4.onnx_data filter=lfs diff=lfs merge=lfs -text
+ onnx/model_bnb4.onnx_data filter=lfs diff=lfs merge=lfs -text
+ onnx/model_q4.onnx_data filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -5,4 +5,20 @@ library_name: transformers.js
 
  https://huggingface.co/MBZUAI/LaMini-GPT-774M with ONNX weights to be compatible with Transformers.js.
 
+ ## Usage (Transformers.js)
+
+ If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
+ ```bash
+ npm i @huggingface/transformers
+ ```
+
+ **Example:** Text generation.
+
+ ```js
+ import { pipeline } from '@huggingface/transformers';
+
+ const generator = await pipeline('text-generation', 'Xenova/LaMini-GPT-774M');
+ const output = await generator('Once upon a time, there was', { max_new_tokens: 10 });
+ ```
+
  Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
onnx/decoder_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:535fc0f0d2f7370a8bea4ac0e51794aa508a9458aaf26e499b1a68623efb3256
+ size 1100511
onnx/decoder_model_bnb4.onnx_data ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:357c7108942f572794ed8ceed78e8c9e3b42311393a57449a7e267d7214a0f21
+ size 3097174016
onnx/decoder_model_fp16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:714d59f4b9ae68dac2996854858f7733f5982873a9b1a42a1ea93451b0efb433
+ size 1550181205
onnx/decoder_model_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:20423774c8d27fca20e10fc4ba1095de919b2968f73ea9e2269796e5fcd356a2
+ size 1035491877
onnx/decoder_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0b6c4b9f55da6e508fb124799f7f0c095f5665fd747af3440fedba7b119ba482
+ size 1099581
onnx/decoder_model_q4.onnx_data ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:357c7108942f572794ed8ceed78e8c9e3b42311393a57449a7e267d7214a0f21
+ size 3097174016
onnx/decoder_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dffabc5cb3af28689b0dce6644ed6880abd53517d00347cb58981134763d81e7
+ size 1550181224
onnx/decoder_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5cad481153d70cc683e032f6f36c8c49a963a22997dbaabe96d6bcb33ce96d32
+ size 1035491952
onnx/decoder_with_past_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ad21e7543c9a550c42ff2f1cb6c1d1621c342740a7958b17782d840987c704e5
+ size 1242020
onnx/decoder_with_past_model_bnb4.onnx_data ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:357c7108942f572794ed8ceed78e8c9e3b42311393a57449a7e267d7214a0f21
+ size 3097174016
onnx/decoder_with_past_model_fp16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7eff39e656bb327f1b9c6f32a198b0e25a71eee4fb2a9cf71e8eb366ec11a45e
+ size 1550331615
onnx/decoder_with_past_model_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:edffd2186bb0e25804b0cced6501ef33cd81821a0e087425d614b7c7268cf15d
+ size 1035636711
onnx/decoder_with_past_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:39797a23f999f1d03a8e6fe22ca544d2ddf52d6c0d4323bc4e51dd9d9b45fb29
+ size 1241127
onnx/decoder_with_past_model_q4.onnx_data ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:357c7108942f572794ed8ceed78e8c9e3b42311393a57449a7e267d7214a0f21
+ size 3097174016
onnx/decoder_with_past_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0e1ce61e8ffe35dbc944e41eeac075d3ee9fb7b0dfb638d077262c1c9f2157af
+ size 1550331634
onnx/decoder_with_past_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:30cc339ba30ceaa7fd210f64a003b2e528e268804442c1025de4ad725717363f
+ size 1035636786
onnx/model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9da185201b2bcfa942474e12d16f4496ac0b415e9ca4c3e03815fc213e681b17
+ size 2477414
onnx/model_bnb4.onnx_data ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:357c7108942f572794ed8ceed78e8c9e3b42311393a57449a7e267d7214a0f21
+ size 3097174016
onnx/model_fp16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bd713a21a1c8873fd3162fb4d7ba34d4dc1983150a49cd2d5ed8729f1c75975a
+ size 1551580635
onnx/model_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d32ecd492ec794676315227ce231e35e783ad93f4303b5d70172055f4eb7c591
+ size 1037072109
onnx/model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2d704a6f193a9120f5ce3a54939f4ea0b3163bee3aced662b5e885bab16f375e
+ size 2476495
onnx/model_q4.onnx_data ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:357c7108942f572794ed8ceed78e8c9e3b42311393a57449a7e267d7214a0f21
+ size 3097174016
onnx/model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0559c458653a7340a2dd2239c190c154a93c634b81956bad5101101ada5145d7
+ size 1551580654
onnx/model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:01c9f287c12fda7625cac0e3367c02b8f911c6ddf7af3721044d95fb1bdf5e86
+ size 1037072184
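
Each added ONNX file above is stored as a Git LFS pointer with three `key value` lines (`version`, `oid`, `size`). As a rough sketch of how such a pointer could be read, assuming exactly that three-line layout, the snippet below parses one of the pointers from this commit; `parseLfsPointer` is a hypothetical helper, not part of Git LFS or Transformers.js.

```js
// Hypothetical helper: parse the "key value" lines of a Git LFS pointer.
function parseLfsPointer(text) {
  const fields = {};
  for (const line of text.trim().split('\n')) {
    const idx = line.indexOf(' ');
    if (idx === -1) throw new Error(`Malformed pointer line: ${line}`);
    // Everything before the first space is the key, the rest is the value.
    fields[line.slice(0, idx)] = line.slice(idx + 1);
  }
  // The oid has the form "<algorithm>:<hex digest>".
  const [algo, hash] = (fields.oid ?? '').split(':');
  return { version: fields.version, algo, hash, size: Number(fields.size) };
}

// Pointer contents copied from onnx/model_q4.onnx in this commit.
const pointer = [
  'version https://git-lfs.github.com/spec/v1',
  'oid sha256:2d704a6f193a9120f5ce3a54939f4ea0b3163bee3aced662b5e885bab16f375e',
  'size 2476495',
].join('\n');

const info = parseLfsPointer(pointer);
console.log(info.algo, info.size); // sha256 2476495
```

The `size` field is the byte count of the real file, which is why the `.onnx` graph files for `q4` and `bnb4` are only a few megabytes while their `.onnx_data` companions carry the bulk of the weights.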