whitphx (HF Staff) committed
Commit c6aa833 · verified · 1 Parent(s): 72b1555

Add/update the quantized ONNX model files and README.md for Transformers.js v3


## Applied Quantizations

### ✅ Based on `vision_model.onnx` *with* slimming

↳ ✅ `int8`: `vision_model_int8.onnx` (added)
↳ ✅ `uint8`: `vision_model_uint8.onnx` (added)
↳ ✅ `q4`: `vision_model_q4.onnx` (added)
↳ ✅ `q4f16`: `vision_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `vision_model_bnb4.onnx` (added)
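
For reference, one of these quantized vision encoders can be loaded on its own with Transformers.js v3. This is a minimal sketch, not part of the commit, assuming the library's `dtype` option maps onto the file suffixes above (e.g. `q4` → `vision_model_q4.onnx`):

```js
import { AutoProcessor, CLIPVisionModelWithProjection, RawImage } from '@huggingface/transformers';

// Load the image processor and a quantized vision encoder.
// dtype: 'q4' is assumed to resolve to vision_model_q4.onnx from the list above.
const model_id = 'Xenova/clip-vit-large-patch14';
const processor = await AutoProcessor.from_pretrained(model_id);
const vision_model = await CLIPVisionModelWithProjection.from_pretrained(model_id, { dtype: 'q4' });

// Embed an example image and inspect the projection output.
const image = await RawImage.read('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/tiger.jpg');
const image_inputs = await processor(image);
const { image_embeds } = await vision_model(image_inputs);
console.log(image_embeds.dims); // e.g. [1, 768] for CLIP ViT-L/14
```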


### ❌ Based on `model.onnx` *with* slimming

```
None
```
↳ ❌ `int8`: `model_int8.onnx` (added but JS-based E2E test failed)
```
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^

Error: Could not find an implementation for ConvInteger(10) node with name '/vision_model/embeddings/patch_embedding/Conv_quant'
    at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25:92)
    at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:67:29)
    at process.processImmediate (node:internal/timers:485:21)

Node.js v22.16.0
```
↳ ✅ `uint8`: `model_uint8.onnx` (added)
↳ ✅ `q4`: `model_q4.onnx` (added)
↳ ✅ `q4f16`: `model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `model_bnb4.onnx` (added)
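
The failure above comes from onnxruntime-node's default CPU execution provider, which has no `ConvInteger` kernel; the other quantizations of `model.onnx` load without issue. Below is a minimal repro/workaround sketch, assuming onnxruntime-node is installed and the ONNX files have been downloaded locally:

```js
import * as ort from 'onnxruntime-node';

// Reproduce the E2E failure: the int8 graph contains ConvInteger nodes that the
// CPU execution provider cannot run, so session creation throws.
try {
  await ort.InferenceSession.create('./onnx/model_int8.onnx');
} catch (err) {
  console.error('int8 load failed:', err.message); // "Could not find an implementation for ConvInteger(10) ..."
}

// Workaround: pick a variant that passed the E2E test, e.g. uint8.
const session = await ort.InferenceSession.create('./onnx/model_uint8.onnx');
console.log('uint8 model inputs:', session.inputNames);
```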


### ✅ Based on `text_model.onnx` *with* slimming

↳ ✅ `int8`: `text_model_int8.onnx` (added)
↳ ✅ `uint8`: `text_model_uint8.onnx` (added)
↳ ✅ `q4`: `text_model_q4.onnx` (added)
↳ ✅ `q4f16`: `text_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `text_model_bnb4.onnx` (added)
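
Similarly, a quantized text encoder from this list can be used on its own; a minimal sketch under the same assumed `dtype`-to-suffix mapping (e.g. `int8` → `text_model_int8.onnx`):

```js
import { AutoTokenizer, CLIPTextModelWithProjection } from '@huggingface/transformers';

// Load the tokenizer and a quantized text encoder.
// dtype: 'int8' is assumed to resolve to text_model_int8.onnx from the list above.
const model_id = 'Xenova/clip-vit-large-patch14';
const tokenizer = await AutoTokenizer.from_pretrained(model_id);
const text_model = await CLIPTextModelWithProjection.from_pretrained(model_id, { dtype: 'int8' });

// Embed a couple of example prompts.
const text_inputs = tokenizer(['a photo of a tiger', 'a photo of a dog'], { padding: true, truncation: true });
const { text_embeds } = await text_model(text_inputs);
console.log(text_embeds.dims); // e.g. [2, 768]
```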


README.md CHANGED
@@ -5,4 +5,21 @@ library_name: transformers.js

https://huggingface.co/openai/clip-vit-large-patch14 with ONNX weights to be compatible with Transformers.js.

+ ## Usage (Transformers.js)
+
+ If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
+ ```bash
+ npm i @huggingface/transformers
+ ```
+
+ **Example:** Zero shot image classification.
+
+ ```js
+ import { pipeline } from '@huggingface/transformers';
+
+ const classifier = await pipeline('zero-shot-image-classification', 'Xenova/clip-vit-large-patch14');
+ const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/tiger.jpg';
+ const output = await classifier(url, ['tiger', 'horse', 'dog']);
+ ```
+
Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
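
The README example above uses the default quantization. With the v3 files added in this commit, a specific variant can also be requested explicitly; a minimal sketch, assuming the pipeline's `dtype` option accepts the suffixes from the quantization report above:

```js
import { pipeline } from '@huggingface/transformers';

// Same zero-shot classification as the README example, but pinning a quantization level.
// dtype: 'q4f16' is assumed to resolve to the *_q4f16.onnx files added in this commit.
const classifier = await pipeline('zero-shot-image-classification', 'Xenova/clip-vit-large-patch14', {
  dtype: 'q4f16',
});
const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/tiger.jpg';
const output = await classifier(url, ['tiger', 'horse', 'dog']);
console.log(output); // e.g. [{ label: 'tiger', score: 0.99 }, ...]
```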
onnx/model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ade112d138d094481ed6c004b7d6ca632a028246773205b9bce0f105f43d475e
+ size 376476226
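
The blocks above and below are Git LFS pointer files: `oid` is the SHA-256 of the actual weight file and `size` is its byte count. A minimal Node sketch, assuming the real `model_bnb4.onnx` has been downloaded to a hypothetical local path, that checks a download against the pointer above:

```js
import { createHash } from 'node:crypto';
import { readFile, stat } from 'node:fs/promises';

// Hypothetical local path to the downloaded weights (not part of this commit).
const path = './onnx/model_bnb4.onnx';
const expectedOid = 'ade112d138d094481ed6c004b7d6ca632a028246773205b9bce0f105f43d475e';
const expectedSize = 376476226;

// Compare the on-disk size and SHA-256 digest with the LFS pointer fields.
const { size } = await stat(path);
const digest = createHash('sha256').update(await readFile(path)).digest('hex');
console.log(size === expectedSize && digest === expectedOid ? 'OK' : 'MISMATCH');
```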
onnx/model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3e4ea452b20ff948b53892c519252e67501bb4ee0c4a0a91e1df81e47bf6fba8
+ size 400743308
onnx/model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2099b3bb1eb602954f6e2c3ae04f1af8680c552bf0cc957ff0626dcd471bf2d1
+ size 297843652
onnx/model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2e444a2acda87e445c8418f1b881e592c4448270c9248de1aaba6c658c8a7593
+ size 430790905
onnx/text_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:867483dfb170ab9b92f7ebf5c40d02f9bb551ed21694517c1e4c784b09f11987
+ size 200927902
onnx/text_model_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:10b25d20d2649b01f9cf244c741849c7d72fb15b9fe1ff594c11bbb78b1e18dc
+ size 124414449
onnx/text_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d8e9252842c59ee28957f6ae30c3c69a5f7376710dc8e71fd9530e4a8deadb0e
+ size 206272647
onnx/text_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:664ff30b5aec9a7733cc4ff5cdd78747723657e78a56a8c6175da767731a386c
+ size 124675686
onnx/text_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:077a7465c2ba220652730ecbe6dba495d83f636020dff15dc2ebbc11bf07c5ba
+ size 124414483
onnx/vision_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:38f632fef571a8a60d1184e26082c1066301fb841227e9d7d746d10fa7a5f0ce
+ size 175521006
onnx/vision_model_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d6eac98fc74a89ef2e50ba635c6b4c7f005dd324dcc6aeb6be89872f58d7b3fd
+ size 306348752
onnx/vision_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:44b58d41e61a6307ca4984ab5029ea28f3a1669a1d2b189d478dfbe8d86bc192
+ size 194443343
onnx/vision_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:46c54f1fdabca466ed8f02967347d5070c8a05211f93f2e390140befc26c441f
+ size 173140193
onnx/vision_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1bffe274bab282da1c079be283a63a065abaaa94dcf44516029c46dbd98ecf8a
+ size 306348826