whitphx HF Staff committed on
Commit
eea9c0c
·
verified ·
1 Parent(s): 0008b94

Add/update the quantized ONNX model files and README.md for Transformers.js v3

## Applied Quantizations

### ✅ Based on `vision_model.onnx` *with* slimming

↳ ✅ `int8`: `vision_model_int8.onnx` (added)
↳ ✅ `uint8`: `vision_model_uint8.onnx` (added)
↳ ✅ `q4`: `vision_model_q4.onnx` (added)
↳ ✅ `q4f16`: `vision_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `vision_model_bnb4.onnx` (added)

### ❌ Based on `model.onnx` *with* slimming

```
None
```
↳ ❌ `int8`: `model_int8.onnx` (added but JS-based E2E test failed; see the load check after this list)
```
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^

Error: Could not find an implementation for ConvInteger(10) node with name '/vision_model/embeddings/patch_embedding/Conv_quant'
at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25:92)
at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:67:29)
at process.processImmediate (node:internal/timers:485:21)

Node.js v22.16.0
```
↳ ✅ `uint8`: `model_uint8.onnx` (added)
↳ ✅ `q4`: `model_q4.onnx` (added)
↳ ✅ `q4f16`: `model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `model_bnb4.onnx` (added)
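
Since the `int8` export above failed to load in onnxruntime-node (no kernel was resolved for the quantized `ConvInteger` node), a standalone load check can reproduce the failure and confirm that the passing variants open fine. This is a minimal sketch, not part of the original test harness; the local file paths are assumptions.

```js
// Minimal load check with onnxruntime-node (the runtime used by the failing E2E test).
// Assumption: the quantized files have been downloaded locally under ./onnx/.
import * as ort from 'onnxruntime-node';

for (const file of ['onnx/model_int8.onnx', 'onnx/model_uint8.onnx']) {
  try {
    // InferenceSession.create throws if any node has no registered kernel,
    // e.g. the ConvInteger(10) error reported above for the int8 variant.
    const session = await ort.InferenceSession.create(file);
    console.log(`${file}: loaded OK (inputs: ${session.inputNames.join(', ')})`);
  } catch (err) {
    console.log(`${file}: failed to load -> ${err.message}`);
  }
}
```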

### ✅ Based on `text_model.onnx` *with* slimming

↳ ✅ `int8`: `text_model_int8.onnx` (added)
↳ ✅ `uint8`: `text_model_uint8.onnx` (added)
↳ ✅ `q4`: `text_model_q4.onnx` (added)
↳ ✅ `q4f16`: `text_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `text_model_bnb4.onnx` (added)
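
With all of the above in place, a consumer can select one of these quantized variants at load time via Transformers.js v3's `dtype` option. A minimal sketch (the `q4` choice and the zero-shot task are just examples, not part of this commit):

```js
// Pick the q4 quantization of model.onnx added in this commit.
import { pipeline } from '@huggingface/transformers';

const classifier = await pipeline(
  'zero-shot-image-classification',
  'Xenova/siglip-base-patch16-224',
  { dtype: 'q4' }, // resolves to onnx/model_q4.onnx; 'uint8', 'q4f16', 'bnb4' select the other variants listed above
);

const url = 'http://images.cocodataset.org/val2017/000000039769.jpg';
const output = await classifier(url, ['2 cats', '2 dogs']);
console.log(output);
```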

README.md CHANGED
@@ -8,14 +8,14 @@ https://huggingface.co/google/siglip-base-patch16-224 with ONNX weights to be co
 
 ## Usage (Transformers.js)
 
-If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@xenova/transformers) using:
+If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
 ```bash
-npm i @xenova/transformers
+npm i @huggingface/transformers
 ```
 
 **Example:** Zero-shot image classification w/ `Xenova/siglip-base-patch16-224`:
 ```js
-import { pipeline } from '@xenova/transformers';
+import { pipeline } from '@huggingface/transformers';
 
 const classifier = await pipeline('zero-shot-image-classification', 'Xenova/siglip-base-patch16-224');
 const url = 'http://images.cocodataset.org/val2017/000000039769.jpg';
@@ -32,7 +32,7 @@ console.log(output);
 **Example:** Compute text embeddings with `SiglipTextModel`.
 
 ```javascript
-import { AutoTokenizer, SiglipTextModel } from '@xenova/transformers';
+import { AutoTokenizer, SiglipTextModel } from '@huggingface/transformers';
 
 // Load tokenizer and text model
 const tokenizer = await AutoTokenizer.from_pretrained('Xenova/siglip-base-patch16-224');
@@ -55,7 +55,7 @@ const { pooler_output } = await text_model(text_inputs);
 **Example:** Compute vision embeddings with `SiglipVisionModel`.
 
 ```javascript
-import { AutoProcessor, SiglipVisionModel, RawImage} from '@xenova/transformers';
+import { AutoProcessor, SiglipVisionModel, RawImage} from '@huggingface/transformers';
 
 // Load processor and vision model
 const processor = await AutoProcessor.from_pretrained('Xenova/siglip-base-patch16-224');
onnx/model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:65b4fd0b4bcccacfc4aa4de3a312cb449dbc1fcc77f1c76c76473baacece68bb
+size 206944154

onnx/model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:35083107956da5d1c7a95e30fe89e19a1589c36dc940b75725200a5e72a37b03
+size 217965403

onnx/model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:40dffd94cfa63d3e6543f656e03dfd829d31df9d77ffd5eb86dbd3a924aa29cf
+size 153346039

onnx/model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ed797eedb1c48d6c067f1af2045cc3a3e99b1d8791e642445a49c512bbca074b
+size 205091308

onnx/text_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fde60e1d2e4b02ade796d82272afe7824cd9d507f64038805846ea4a01498f66
+size 149385374

onnx/text_model_int8.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9cb5102160e2a2b90c0a999ed6c2b4090865c9d5aa08f09cd10993ad9b38bd5f
+size 110982746

onnx/text_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:584d4cd37ba666da9af721ab4afae0cdccea17bc9a373cb3da3a18fa6204b85b
+size 154693262

onnx/text_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:195b9dd5173188e8bd324e90d903d54e85e68a0ff29e1d4e79a7d8a0da053335
+size 98710743

onnx/text_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b671f0b2147f4830ea60ec1094b41c3e0551b475f28a5d4b5fb91e6778109f86
+size 110982791

onnx/vision_model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:49ce142f2dd6fd481f97e1c739c048c8c79ab5cbc0e63dbea649bb17eddc3752
+size 57548753

onnx/vision_model_int8.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6d00762fcb4aef9bdee1b886fcccc7df466f9eae321e180f97d88e24fa13ac72
+size 94098316

onnx/vision_model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ffb4672d02e3e5995f1cb3bf7308c459017014b11a4b129b2762c6872ffb2524
+size 63262114

onnx/vision_model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c9eb383bddf748212d240d90ca765415137d7994c60dd8be97bdb3adad30c682
+size 54625339

onnx/vision_model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ee55b2cf8a596650a2912b9e564ff4b50119228e19ae68d6fb93ca932fb782be
+size 94098362
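
Each `ADDED` entry above is a Git LFS pointer, so the `oid sha256:` value can be used to verify a downloaded file. A minimal Node.js sketch (the local path and the choice of `model_q4.onnx` are assumptions for illustration):

```js
// Verify a locally downloaded file against the sha256 oid from its LFS pointer.
import { createHash } from 'node:crypto';
import { readFile } from 'node:fs/promises';

const expected = '35083107956da5d1c7a95e30fe89e19a1589c36dc940b75725200a5e72a37b03'; // oid for onnx/model_q4.onnx
const digest = createHash('sha256').update(await readFile('onnx/model_q4.onnx')).digest('hex');
console.log(digest === expected ? 'sha256 matches the LFS pointer' : 'sha256 mismatch');
```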