# Whisper Large V3 Turbo - Pruna Smashed

A Pruna-optimized ("smashed") version of Whisper Large V3 Turbo, compressed with the `c_whisper` compiler for faster inference and lower VRAM usage at the same transcription quality.

## Usage

Best performance (requires the `pruna` package, e.g. `pip install pruna`):

```python
from pruna import PrunaModel

# Load the smashed model directly from the Hub.
model = PrunaModel.from_pretrained("manohar03/unsloth-whisper-large-v3-turbo-pruna-smashed")

# Transcribe an audio file.
result = model("audio.wav")
```

Standard `transformers` loading:

```python
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor

# Load the model weights and the matching processor
# (feature extractor + tokenizer) from the same repo.
model = AutoModelForSpeechSeq2Seq.from_pretrained("manohar03/unsloth-whisper-large-v3-turbo-pruna-smashed")
processor = AutoProcessor.from_pretrained("manohar03/unsloth-whisper-large-v3-turbo-pruna-smashed")
```
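With the standard `transformers` path, the processor turns raw audio into the log-mel features the model expects. A minimal sketch below feeds one second of silence through the feature extractor; the 16 kHz sample rate and the 128-bin mel frontend are assumptions based on the Whisper large-v3 family, not stated in this card.

```python
import numpy as np
from transformers import AutoProcessor

# Load only the processor (feature extractor + tokenizer); this is a small download.
processor = AutoProcessor.from_pretrained("manohar03/unsloth-whisper-large-v3-turbo-pruna-smashed")

# One second of silence at Whisper's expected 16 kHz sample rate (assumed).
audio = np.zeros(16000, dtype=np.float32)

# The feature extractor pads/truncates to a 30-second window and
# produces a log-mel spectrogram ready for model.generate(...).
inputs = processor(audio, sampling_rate=16000, return_tensors="np")
print(inputs.input_features.shape)
```

The resulting `input_features` tensor is what `model.generate(...)` consumes when decoding a transcription.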

Tested on a T4 GPU.
