📢 Overview

This repository hosts a fine-tuned version of OpenAI's Whisper Small model for English Automatic Speech Recognition (ASR). Fine-tuned by Marwan Kasem, the model focuses on transcribing clean, headset-recorded English speech, especially from conversational scenarios.

Whisper-Small-Final (English) – Fine-Tuned ASR Model

🚀 A fine-tuned version of OpenAI's Whisper-small for automatic speech recognition (ASR). The model was trained on carefully curated and preprocessed everyday speech data to improve transcription accuracy on everyday spoken English in real-world settings.

📚 Dataset

This model was fine-tuned on the AMI IHM dataset, the subset of the AMI Meeting Corpus recorded with individual headset microphones (IHM). The dataset features:

Clear, channel-separated speech from multiple speakers
Realistic meeting environments
Conversational English with varying accents

🔎 Hugging Face Dataset: distil-whisper/ami-ihm

🧠 Model Details

Base model: openai/whisper-small
Fine-tuned on: AMI IHM (cleaned English meeting speech)
Processor: WhisperProcessor (wraps the Whisper feature extractor and tokenizer)
Intended use: English transcription from headset or clean recordings

🔧 Usage

You can use the model via Hugging Face's pipeline:

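from transformers import pipeline

# Name the task explicitly instead of relying on inference from the model's Hub metadata.
pipe = pipeline("automatic-speech-recognition", model="Marwan-Kasem/whisper-small-Final")

def transcribe(audio_file):
    """Return the transcript for the audio file at the given path."""
    return pipe(audio_file)["text"]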

Example output:

Input: [audio of spoken sentence]
Output: "You need to move a little faster than that, son. Speed is life."
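To get a rough sense of transcription accuracy on your own recordings, you can compare a transcript against a reference you typed by hand. The sketch below computes word error rate (WER) in plain Python. It is an illustration, not the evaluation used for this model: the function name `word_error_rate` is hypothetical, and the normalization (lowercasing and whitespace splitting only, punctuation kept) is a simplifying assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: text is only lowercased and split on whitespace, so
    punctuation differences count as errors.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative strings only; in practice the hypothesis would come from transcribe().
wer = word_error_rate("speed is life", "speed is live")
print(f"WER: {wer:.2%}")
```

A WER of 0 means the transcript matches the reference word for word; values can exceed 1 when the hypothesis contains many extra words.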

💻 Live Demo

Test the model live with Gradio:

import gradio as gr

iface = gr.Interface(
    fn=transcribe,
    inputs=gr.Audio(sources=["microphone"], type="filepath"),  # Gradio 4+; Gradio 3.x used source="microphone"
    outputs="text",
    title="Whisper Small – English ASR",
    description="Real-time demo for English speech recognition using a fine-tuned Whisper Small model."
)

iface.launch()