whitphx (HF Staff) committed
Commit `deb99ba` · verified · 1 parent: 55d88f2

Add/update the quantized ONNX model files and README.md for Transformers.js v3


## Applied Quantizations

### ✅ Based on `vision_model.onnx` *with* slimming

↳ ✅ `int8`: `vision_model_int8.onnx` (added)
↳ ✅ `uint8`: `vision_model_uint8.onnx` (added)
↳ ✅ `q4`: `vision_model_q4.onnx` (added)
↳ ✅ `q4f16`: `vision_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `vision_model_bnb4.onnx` (added)

### ❌ Based on `model.onnx` *with* slimming

↳ ❌ `int8`: `model_int8.onnx` (added but JS-based E2E test failed)
```
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^

Error: Could not find an implementation for ConvInteger(10) node with name '/vision_model/embeddings/patch_embedding/Conv_quant'
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)

Node.js v22.16.0
```
↳ ✅ `uint8`: `model_uint8.onnx` (added)
↳ ✅ `q4`: `model_q4.onnx` (added)
↳ ✅ `q4f16`: `model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `model_bnb4.onnx` (added)

### ✅ Based on `text_model.onnx` *with* slimming

↳ ✅ `int8`: `text_model_int8.onnx` (added)
↳ ✅ `uint8`: `text_model_uint8.onnx` (added)
↳ ✅ `q4`: `text_model_q4.onnx` (added)
↳ ✅ `q4f16`: `text_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `text_model_bnb4.onnx` (added)
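In Transformers.js v3 (`@huggingface/transformers`), one of the quantizations listed above is selected at load time via the `dtype` option, and the file picked on the Hub follows the `<base>_<dtype>.onnx` naming used throughout this commit. A minimal sketch; the `quantizedFileName` helper is hypothetical, introduced only to illustrate the naming convention:

```javascript
// Hypothetical helper mirroring this commit's file naming convention,
// e.g. ('vision_model', 'q4f16') -> 'vision_model_q4f16.onnx'
function quantizedFileName(base, dtype) {
  return `${base}_${dtype}.onnx`;
}

// With Transformers.js v3 a quantization is requested like this
// (requires `npm i @huggingface/transformers` and network access,
// so it is shown here as a comment rather than executed):
//
//   import { pipeline } from '@huggingface/transformers';
//   const classifier = await pipeline(
//     'zero-shot-image-classification',
//     'Xenova/siglip-base-patch16-384',
//     { dtype: 'q4' },  // or 'int8', 'uint8', 'q4f16', 'bnb4'
//   );
```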

README.md CHANGED

````diff
@@ -7,14 +7,14 @@ https://huggingface.co/google/siglip-base-patch16-384 with ONNX weights to be co
 
 ## Usage (Transformers.js)
 
-If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@xenova/transformers) using:
+If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
 ```bash
-npm i @xenova/transformers
+npm i @huggingface/transformers
 ```
 
 **Example:** Zero-shot image classification w/ `Xenova/siglip-base-patch16-384`:
 ```js
-import { pipeline } from '@xenova/transformers';
+import { pipeline } from '@huggingface/transformers';
 
 const classifier = await pipeline('zero-shot-image-classification', 'Xenova/siglip-base-patch16-384');
 const url = 'http://images.cocodataset.org/val2017/000000039769.jpg';
@@ -31,7 +31,7 @@ console.log(output);
 **Example:** Compute text embeddings with `SiglipTextModel`.
 
 ```javascript
-import { AutoTokenizer, SiglipTextModel } from '@xenova/transformers';
+import { AutoTokenizer, SiglipTextModel } from '@huggingface/transformers';
 
 // Load tokenizer and text model
 const tokenizer = await AutoTokenizer.from_pretrained('Xenova/siglip-base-patch16-384');
@@ -54,7 +54,7 @@ const { pooler_output } = await text_model(text_inputs);
 **Example:** Compute vision embeddings with `SiglipVisionModel`.
 
 ```javascript
-import { AutoProcessor, SiglipVisionModel, RawImage} from '@xenova/transformers';
+import { AutoProcessor, SiglipVisionModel, RawImage} from '@huggingface/transformers';
 
 // Load processor and vision model
 const processor = await AutoProcessor.from_pretrained('Xenova/siglip-base-patch16-384');
````
onnx/model_bnb4.onnx ADDED

```diff
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b8cde46b18e22e251ceaa5a7a8cdb9cec5eb7cf1d0888fce14b9d6bef528063d
+size 208111514
```

onnx/model_q4.onnx ADDED

```diff
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e851b553e5adc8d790b482db67d740291a68fcc84002cc6ff26e0c48069fd82d
+size 219132763
```

onnx/model_q4f16.onnx ADDED

```diff
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8945e90e02ebbadfeb6937995cfaa9542fc6e61272e8b53d73cf00286df6ccb0
+size 153929719
```

onnx/model_uint8.onnx ADDED

```diff
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b3732f027ecf278adc92b991f9e48fa00f8f5dc248c7366e67f0b55096d3675c
+size 206258669
```

onnx/text_model_bnb4.onnx ADDED

```diff
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ed33d7aea430837f6c2f12d51d4f15c851ed7a71a6ab92a6bf6d7852f023d440
+size 149385374
```

onnx/text_model_int8.onnx ADDED

```diff
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:881fec3fd70dd7163cb2b187603b8e375887a27784ebf1e35fa9d42e73ef91a7
+size 110982746
```

onnx/text_model_q4.onnx ADDED

```diff
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:960171024e219926f0341009fff19286849fd1f744c397ae5e9381e41e2adf3f
+size 154693262
```

onnx/text_model_q4f16.onnx ADDED

```diff
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d516de20cb0c17cffd3bdcc148fe3b4a7e167e18f99463c7a149c91247ed8cf6
+size 98710743
```

onnx/text_model_uint8.onnx ADDED

```diff
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7552b638ca58936efcb1fd4cd478fce6cc5213e53b344f187d5fa27f8f0174e0
+size 110982789
```

onnx/vision_model_bnb4.onnx ADDED

```diff
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:24b9afeba00fc67e738a6b7770799b0ba82980ff5356b306b0c5593be618f707
+size 58716113
```

onnx/vision_model_int8.onnx ADDED

```diff
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1c1cb4265d2808c1ea2d3a78cbf678c720281fd7076c98ef5893d7a4384586f5
+size 95265676
```

onnx/vision_model_q4.onnx ADDED

```diff
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:be5f8738097053e86f48c2a6edbdde089090c332f5c4fcb9c6d802e9ccbd080b
+size 64429474
```

onnx/vision_model_q4f16.onnx ADDED

```diff
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:14e5676080704fff7847b574414fd44e144b124cf4bd65ea1c5366eea35b6400
+size 55209019
```

onnx/vision_model_uint8.onnx ADDED

```diff
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:09879257d680b168941827e3ba656184a389be08ea3164f96c3ce1a8f52f62be
+size 95265725
```
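Each of the files above is stored as a Git LFS pointer: a three-line text stub recording the spec version, the SHA-256 of the actual payload, and its size in bytes. A small standard-library sketch of producing that format for arbitrary bytes; the `lfs_pointer` function is hypothetical, shown only to illustrate the layout used in the diffs above:

```python
import hashlib


def lfs_pointer(data: bytes) -> str:
    """Build a Git LFS pointer file body for the given payload bytes.

    Matches the three-line layout seen in the diffs above:
    version line, "oid sha256:<hex digest>", and "size <byte count>".
    """
    oid = hashlib.sha256(data).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(data)}\n"
    )
```

Comparing the `oid` and `size` of a locally downloaded `.onnx` file against the values recorded here is a quick integrity check after cloning without LFS.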