📒 Overview

This repository hosts a fine-tuned version of OpenAI's Whisper Small model for English Automatic Speech Recognition (ASR). Fine-tuned by Marwan Kasem, the model focuses on transcribing clean, headset-recorded English speech, especially from conversational scenarios.

Whisper-Small-Final (English) – Fine-Tuned ASR Model

🚀 A fine-tuned version of OpenAI's Whisper-small for automatic speech recognition (ASR). The model was trained on curated, preprocessed everyday speech data to improve transcription accuracy on spoken English in real-world settings.

📚 Dataset

This model was fine-tuned on the AMI IHM dataset, a subset of the AMI Meeting Corpus recorded via individual headset microphones. The dataset features:

  • Clear, channel-separated speech from multiple speakers
  • Realistic meeting environments
  • Conversational English with varying accents

🔎 Hugging Face Dataset: distil-whisper/ami-ihm

🧠 Model Details

  • Base model: openai/whisper-small
  • Fine-tuned on: AMI IHM (cleaned English meeting speech)
  • Tokenizer: WhisperProcessor
  • Intended use: English transcription from headset or clean recordings
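
The details above can also be used without the pipeline wrapper. The following is a minimal sketch, assuming the repository's processor files load with `WhisperProcessor.from_pretrained` (as the Tokenizer entry suggests) and that the input is a mono waveform already resampled to 16 kHz, the rate Whisper's feature extractor expects. The names `check_sampling_rate` and `transcribe_array` are illustrative and not part of the model repository.

```python
MODEL_ID = "Marwan-Kasem/whisper-small-Final"
SAMPLE_RATE = 16_000  # Whisper's feature extractor expects 16 kHz audio


def check_sampling_rate(sampling_rate: int) -> None:
    """Reject audio that has not been resampled to 16 kHz."""
    if sampling_rate != SAMPLE_RATE:
        raise ValueError(
            f"expected {SAMPLE_RATE} Hz audio, got {sampling_rate} Hz; resample first"
        )


def transcribe_array(waveform, sampling_rate: int = SAMPLE_RATE) -> str:
    """Transcribe a mono float waveform (1-D array) with the fine-tuned model."""
    # Imported here so the helper above works without torch/transformers installed.
    import torch
    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    check_sampling_rate(sampling_rate)

    # Assumption: the repository ships everything WhisperProcessor needs.
    processor = WhisperProcessor.from_pretrained(MODEL_ID)
    model = WhisperForConditionalGeneration.from_pretrained(MODEL_ID)
    model.eval()

    # Convert the raw waveform to log-mel input features.
    inputs = processor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        predicted_ids = model.generate(inputs.input_features)

    # Decode token ids to text, dropping Whisper's special tokens.
    return processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
```

This sketch reloads the processor and model on every call for brevity; for repeated transcription it would make sense to load them once and reuse them.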

🔧 Usage

You can use the model via Hugging Face's pipeline:

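from transformers import pipeline

# Load the fine-tuned checkpoint as a speech-recognition pipeline.
pipe = pipeline("automatic-speech-recognition", model="Marwan-Kasem/whisper-small-Final")

def transcribe(audio_file):
    # audio_file is a path to an audio file; the pipeline returns a dict with a "text" key.
    return pipe(audio_file)["text"]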

Example output:

Input: [audio of spoken sentence]
Output: "You need to move a little faster than that, son. Speed is life."
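
To check transcripts like this one against a known reference, word error rate (WER) is the usual ASR metric. This card does not report scores or say how accuracy was measured, so the following is only an illustrative sketch: a plain-Python WER based on word-level edit distance, with a hypothetical helper name `word_error_rate`. It lowercases both strings but does not strip punctuation, so normalize the text first if punctuation differences should not count as errors.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


reference = "you need to move a little faster than that son speed is life"
hypothesis = "you need to move a little faster than that son speed was life"
print(f"WER: {word_error_rate(reference, hypothesis):.3f}")
```

Here one of the thirteen reference words is substituted, so the score is 1/13.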

💻 Live Demo

Test the model live with Gradio:

import gradio as gr

# Uses the transcribe() function defined in the Usage section.
iface = gr.Interface(
    fn=transcribe,
    # Gradio 4+ takes sources=[...]; Gradio 3.x used source="microphone".
    inputs=gr.Audio(sources=["microphone"], type="filepath"),
    outputs="text",
    title="Whisper Small – English ASR",
    description="Real-time demo for English speech recognition using a fine-tuned Whisper Small model."
)

iface.launch()

📂 Included Files

  • training_args.bin
  • config.json
  • generation_config.json
  • tokenizer_config.json
  • model.safetensors
  • optimizer.pt
  • scheduler.pt
  • preprocessor_config.json

👀 Author

Name: Marwan Kasem

Mail: [email protected]

GitHub: https://github.com/MarwanKAsem

LinkedIn: https://www.linkedin.com/in/marwan-kasem-447009221

Specialty: Junior NLP Engineer
