Whisper Base ONNX
This is an ONNX conversion of OpenAI's whisper-base model, optimized for use with Transformers.js.
Model Details
- Model Type: Whisper (Encoder-Decoder)
- Task: Automatic Speech Recognition
- Format: ONNX (INT8 Quantized)
- Size: ~75MB (quantized from ~300MB)
Usage
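import { pipeline } from '@huggingface/transformers';
// Load the speech recognition pipeline; model files are downloaded on first use.
const transcriber = await pipeline('automatic-speech-recognition', 'markusingvarsson/whisper-test');
// Transcribe the audio and print the recognized text.
const result = await transcriber('audio.wav');
console.log(result.text);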
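Because the conversion adds alignment heads for timestamp support, it may be useful to request timestamps at inference time. The following is a minimal sketch, assuming the pipeline accepts the `return_timestamps` and `chunk_length_s` options of Transformers.js and returns a `chunks` array of `{ timestamp: [start, end], text }` objects. The helper names `formatChunk` and `transcribeWithTimestamps` are hypothetical and are not part of this repository.

```javascript
// Hypothetical helper: format one timestamped chunk as "[start -> end] text".
function formatChunk(chunk) {
  const [start, end] = chunk.timestamp;
  // Guard against a missing end time, which may occur on a truncated final segment.
  const endLabel = end == null ? '?' : end.toFixed(2);
  return `[${start.toFixed(2)} -> ${endLabel}] ${chunk.text.trim()}`;
}

// Hypothetical wrapper: transcribe audio and return one formatted line per chunk.
async function transcribeWithTimestamps(audioUrl) {
  // Loaded lazily so formatChunk can be used without the library installed.
  const { pipeline } = await import('@huggingface/transformers');
  const transcriber = await pipeline(
    'automatic-speech-recognition',
    'markusingvarsson/whisper-test',
  );
  // Assumption: 30-second chunks, matching Whisper's input window.
  const output = await transcriber(audioUrl, {
    return_timestamps: true,
    chunk_length_s: 30,
  });
  return output.chunks.map(formatChunk);
}
```

Calling `transcribeWithTimestamps('audio.wav')` should resolve to an array of lines, one per segment, which can be joined with newlines for display.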
Conversion Details
This model was converted using a custom conversion pipeline that:
- Downloads the original Hugging Face model
- Exports to ONNX format with KV caching
- Applies INT8 quantization for smaller size
- Adds Whisper-specific alignment heads for timestamp support
The quantized models are approximately 4x smaller than the original (~75MB versus ~300MB) while largely maintaining accuracy.
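The roughly 4x reduction is what one would expect from storing each weight in 1 byte (INT8) instead of 4 bytes, assuming the original weights are 32-bit floats. The sketch below illustrates the idea with simple symmetric per-tensor quantization. It is not the conversion pipeline's code, and the function names `quantizeInt8` and `dequantizeInt8` are hypothetical. Real quantization tools also store scales and may leave some tensors unquantized, so actual file sizes only approximate this ratio.

```javascript
// Illustrative symmetric per-tensor INT8 quantization (not the actual pipeline code).
function quantizeInt8(weights) {
  // Choose a scale that maps the largest absolute weight to 127.
  let maxAbs = 0;
  for (const w of weights) maxAbs = Math.max(maxAbs, Math.abs(w));
  const scale = maxAbs === 0 ? 1 : maxAbs / 127;

  const q = new Int8Array(weights.length);
  for (let i = 0; i < weights.length; i++) {
    // Round to the nearest integer step and clamp to the symmetric INT8 range.
    q[i] = Math.max(-127, Math.min(127, Math.round(weights[i] / scale)));
  }
  return { q, scale };
}

// Recover approximate float weights from the quantized values.
function dequantizeInt8({ q, scale }) {
  return Float32Array.from(q, (v) => v * scale);
}

const weights = new Float32Array([0.5, -1.0, 0.25, 0.0]);
const quantized = quantizeInt8(weights);
const restored = dequantizeInt8(quantized);

// Storage ratio: 4 bytes per FP32 weight versus 1 byte per INT8 weight.
const ratio = weights.byteLength / quantized.q.byteLength;
console.log(`size ratio: ${ratio}x`);
```

Each restored weight differs from the original by at most about half a quantization step, which is the source of the small accuracy cost that quantization trades for size.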