Add/update the quantized ONNX model files and README.md for Transformers.js v3
Applied Quantizations
✅ Based on `decoder_model.onnx` with slimming
  ↳ ✅ `fp16`: `decoder_model_fp16.onnx` (added)
  ↳ ✅ `int8`: `decoder_model_int8.onnx` (added)
  ↳ ✅ `uint8`: `decoder_model_uint8.onnx` (added)
  ↳ ✅ `q4`: `decoder_model_q4.onnx` (added)
  ↳ ✅ `q4f16`: `decoder_model_q4f16.onnx` (added)
  ↳ ✅ `bnb4`: `decoder_model_bnb4.onnx` (added)
✅ Based on `encoder_model.onnx` with slimming
  ↳ ✅ `int8`: `encoder_model_int8.onnx` (added)
  ↳ ✅ `uint8`: `encoder_model_uint8.onnx` (added)
  ↳ ✅ `q4`: `encoder_model_q4.onnx` (added)
  ↳ ✅ `q4f16`: `encoder_model_q4f16.onnx` (added)
  ↳ ✅ `bnb4`: `encoder_model_bnb4.onnx` (added)
✅ Based on `decoder_with_past_model.onnx` with slimming
  ↳ ✅ `fp16`: `decoder_with_past_model_fp16.onnx` (added)
  ↳ ✅ `int8`: `decoder_with_past_model_int8.onnx` (added)
  ↳ ✅ `uint8`: `decoder_with_past_model_uint8.onnx` (added)
  ↳ ✅ `q4`: `decoder_with_past_model_q4.onnx` (added)
  ↳ ✅ `q4f16`: `decoder_with_past_model_q4f16.onnx` (added)
  ↳ ✅ `bnb4`: `decoder_with_past_model_bnb4.onnx` (added)
✅ Based on `decoder_model_merged.onnx` with slimming
  ↳ ✅ `fp16`: `decoder_model_merged_fp16.onnx` (replaced because it was invalid)
  ↳ ✅ `int8`: `decoder_model_merged_int8.onnx` (added)
  ↳ ✅ `uint8`: `decoder_model_merged_uint8.onnx` (added)
  ↳ ✅ `q4`: `decoder_model_merged_q4.onnx` (added)
  ↳ ✅ `q4f16`: `decoder_model_merged_q4f16.onnx` (added)
  ↳ ✅ `bnb4`: `decoder_model_merged_bnb4.onnx` (added)
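For reference, these quantized variants can be selected at load time in Transformers.js v3 via the `dtype` option. A minimal sketch, assuming a seq2seq pipeline task and a placeholder model id (neither is specified in this PR):

```js
// Minimal sketch: loading the quantized variants with Transformers.js v3.
// "your-org/your-model" and the task are placeholders, not from this PR.
import { pipeline } from '@huggingface/transformers';

// Use one dtype for every module...
const pipe = await pipeline('text2text-generation', 'your-org/your-model', {
  dtype: 'q4', // e.g. 'fp16', 'int8', 'uint8', 'q4', 'q4f16', 'bnb4'
});

// ...or mix dtypes per module, keyed by the file prefixes listed above.
// Note: the encoder has no fp16 variant in this PR, so pick from its list.
const mixed = await pipeline('text2text-generation', 'your-org/your-model', {
  dtype: {
    encoder_model: 'int8',
    decoder_model_merged: 'q4',
  },
});

console.log(await pipe('Hello, world!'));
```

Transformers.js generally prefers the `decoder_model_merged_*` files when present, since the merged graph serves both the first decoding step and subsequent steps with the KV cache, which is why replacing the invalid `fp16` merged decoder matters here.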
LGTM!