Whisper Base ONNX

This is an ONNX conversion of OpenAI's whisper-base model, optimized for use with Transformers.js.

Model Details

  • Model Type: Whisper (Encoder-Decoder)
  • Task: Automatic Speech Recognition
  • Format: ONNX (INT8 Quantized)
  • Size: ~75MB (quantized from ~300MB)

Usage

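import { pipeline } from '@huggingface/transformers';

// Top-level await requires an ES module context.
// The model files are fetched on first use and cached afterwards.
const transcriber = await pipeline('automatic-speech-recognition', 'markusingvarsson/whisper-test');

// Transcribe the audio at the given URL; the recognized text is in `result.text`.
const result = await transcriber('audio.wav');
console.log(result.text);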
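
Since the conversion includes alignment heads for timestamp support, the pipeline can also be asked for timestamps. The following is a minimal sketch, not part of the original card: it assumes the Transformers.js `return_timestamps` option and the `chunks` array it returns, and the helper names `formatChunks` and `transcribeWithTimestamps` are hypothetical.

```javascript
// Hypothetical helper: turn timestamped chunks into readable lines.
// Each chunk is assumed to look like { timestamp: [start, end], text }.
function formatChunks(chunks) {
  return chunks.map(({ timestamp, text }) => {
    const [start, end] = timestamp;
    // The end time of the final chunk may be missing (null).
    const endLabel = end === null || end === undefined ? '?' : `${end.toFixed(2)}s`;
    return `[${start.toFixed(2)}s - ${endLabel}] ${text.trim()}`;
  });
}

// Hypothetical wrapper around the pipeline shown in the Usage section.
async function transcribeWithTimestamps(audioUrl) {
  const { pipeline } = await import('@huggingface/transformers');
  const transcriber = await pipeline(
    'automatic-speech-recognition',
    'markusingvarsson/whisper-test'
  );
  // `return_timestamps: true` requests segment-level timestamps.
  // Passing 'word' instead is assumed to give word-level timestamps.
  const output = await transcriber(audioUrl, { return_timestamps: true });
  return formatChunks(output.chunks);
}
```

Calling `transcribeWithTimestamps('audio.wav')` would then resolve to one formatted line per segment, which can be joined with newlines for display.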

Conversion Details

This model was converted using a custom conversion pipeline that:

  1. Downloads the original Hugging Face model
  2. Exports to ONNX format with KV caching
  3. Applies INT8 quantization for smaller size
  4. Adds Whisper-specific alignment heads for timestamp support

The quantized models are approximately 4x smaller than the original (~75MB versus ~300MB) while keeping accuracy close to that of the original model.
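
The 4x figure follows from the storage width per weight. As a rough back-of-the-envelope sketch, assuming the roughly 74M parameters OpenAI lists for whisper-base, 4 bytes per FP32 weight, and 1 byte per INT8 weight (the function name `estimateSizeMB` is hypothetical):

```javascript
// Approximate parameter count of whisper-base (assumption: ~74M).
const WHISPER_BASE_PARAMS = 74e6;

// Estimate the weight storage in MB for a given number of bytes per parameter.
function estimateSizeMB(paramCount, bytesPerParam) {
  return (paramCount * bytesPerParam) / 1e6;
}

const fp32SizeMB = estimateSizeMB(WHISPER_BASE_PARAMS, 4); // 32-bit floats
const int8SizeMB = estimateSizeMB(WHISPER_BASE_PARAMS, 1); // 8-bit integers

console.log(`FP32: ~${fp32SizeMB} MB, INT8: ~${int8SizeMB} MB`);
console.log(`Reduction: ~${fp32SizeMB / int8SizeMB}x`);
```

The actual files differ slightly from this estimate, since quantization scales, any layers left unquantized, and graph metadata also take space.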
