# Distil-Whisper: Distil-Large-v3.5 for OpenAI Whisper
This repository contains the model weights for distil-large-v3.5 converted to OpenAI Whisper format.
## Python Usage
To use the model in the original Whisper format, first ensure you have the `openai-whisper` package installed. For this example, we'll also install 🤗 Datasets to load a toy audio dataset from the Hugging Face Hub:

```bash
pip install --upgrade pip
pip install --upgrade openai-whisper datasets[audio]
```
The following code snippet demonstrates how to transcribe a sample file from the LibriSpeech dataset loaded using 🤗 Datasets:

```python
from datasets import load_dataset
from huggingface_hub import hf_hub_download
import whisper

# Download the converted weights from the Hugging Face Hub
model_path = hf_hub_download(repo_id="distil-whisper/distil-large-v3.5-openai", filename="model.bin")
model = whisper.load_model(model_path)

# Load a toy audio dataset and take the first sample
dataset = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
sample = dataset[0]["audio"]["path"]

result = model.transcribe(sample, language="en")
print(result["text"])
```
Note that the model weights will be downloaded and saved to your cache the first time you run the example. On subsequent runs, the weights are loaded directly from the cache without being downloaded again.
To transcribe a local audio file, simply pass the path to the audio file as the `audio` argument to `transcribe`:

```python
result = model.transcribe(audio="audio.mp3", language="en")
print(result["text"])
```
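Beyond the full transcription under the `"text"` key, the dictionary returned by `transcribe` also contains segment-level timestamps under `"segments"`. A minimal sketch of printing them, using a hypothetical result dictionary in place of a real model call for illustration:

```python
# Hypothetical output of model.transcribe(), for illustration only;
# a real result contains one entry per decoded segment
result = {
    "text": " Hello world.",
    "segments": [
        {"start": 0.0, "end": 1.5, "text": " Hello world."},
    ],
}

# Print each segment with its start/end time in seconds
for seg in result["segments"]:
    print(f"[{seg['start']:.2f}s -> {seg['end']:.2f}s]{seg['text']}")
```

Each segment entry reports its start and end time in seconds, which is useful for generating subtitles or aligning the transcript with the audio.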
## CLI Usage
The Distil-Whisper model can also be used with the OpenAI Whisper CLI. First, pip install the Hugging Face Hub package:

```bash
pip install --upgrade huggingface_hub
```
Next, download the weights for distil-large-v3.5 locally:

```bash
huggingface-cli download distil-whisper/distil-large-v3.5-openai model.bin --local-dir distil-large-v3.5
```
Finally, use the OpenAI Whisper CLI to transcribe:

```bash
whisper audio.mp3 --model distil-large-v3.5/model.bin --language en
```
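The CLI also accepts the standard openai-whisper output options. For example, to write an SRT subtitle file to a chosen directory (a sketch using the `--output_format` and `--output_dir` flags from the openai-whisper CLI; the file and directory names are placeholders):

```shell
# Transcribe and write the result as an SRT subtitle file to ./transcripts
whisper audio.mp3 --model distil-large-v3.5/model.bin --language en \
    --output_format srt --output_dir transcripts
```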
## Model Details
For more information about the Distil-Large-v3.5 model, refer to the original model card.
## License
Distil-Whisper inherits the MIT license from OpenAI's Whisper model.
## Citation
If you use this model, please consider citing the Distil-Whisper paper:
```bibtex
@misc{gandhi2023distilwhisper,
      title={Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling},
      author={Sanchit Gandhi and Patrick von Platen and Alexander M. Rush},
      year={2023},
      eprint={2311.00430},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```