Whisper-Small Persian STT — LoRA Fine-Tuned

A version of openai/whisper-small fine-tuned with LoRA for Persian speech-to-text (ASR).
This model is optimized for Persian conversational speech and clean, dataset-quality audio.

Model Details

Model Description

This model is a LoRA fine-tuned Whisper Small focused on Persian (fa) speech recognition.
It is intended to improve transcription accuracy on standard Persian audio segments (16 kHz, mono, normalized WAV).
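
The input format above (16 kHz, mono, normalized) can be approximated with a few lines of preprocessing. The sketch below is only an illustration in pure Python: it assumes "normalized" means peak normalization, uses plain linear interpolation with no anti-aliasing filter, and all function names are hypothetical. A dedicated audio library would normally be a better choice for real data.

```python
from typing import List, Sequence


def downmix_to_mono(channels: Sequence[Sequence[float]]) -> List[float]:
    """Average equally long channels into a single mono signal."""
    n = len(channels)
    return [sum(frame) / n for frame in zip(*channels)]


def resample_linear(samples: Sequence[float], src_rate: int,
                    dst_rate: int = 16000) -> List[float]:
    """Resample by linear interpolation (sketch only, no anti-aliasing)."""
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)
    n_out = int(len(samples) * dst_rate / src_rate)
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        # Fractional position in the source, clamped to the last sample.
        pos = min(i * src_rate / dst_rate, last)
        lo = min(int(pos), last - 1)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[lo + 1] * frac)
    return out


def peak_normalize(samples: Sequence[float], peak: float = 0.95) -> List[float]:
    """Scale so the largest absolute sample equals `peak`; silence is unchanged."""
    max_abs = max((abs(s) for s in samples), default=0.0)
    if max_abs == 0.0:
        return list(samples)
    return [s * peak / max_abs for s in samples]


def prepare(channels: Sequence[Sequence[float]], src_rate: int) -> List[float]:
    """Downmix, resample to 16 kHz, then peak-normalize."""
    return peak_normalize(resample_linear(downmix_to_mono(channels), src_rate))
```

For example, `prepare([left, right], 44100)` would turn two 44.1 kHz channel lists into one 16 kHz mono list of floats.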

Developed by: Mehdi Pouladrag
Model type: Speech-to-Text (ASR) — Whisper Small (Seq2Seq Transformer)
Language(s): Persian (fa)
License: MIT
Finetuned from: openai/whisper-small
Dataset: persian-voice-v1 (single dataset)
Training technique: LoRA (Low-Rank Adaptation)
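
To give a sense of why LoRA is cheap to train, the sketch below counts the parameters LoRA adds to one square projection matrix compared with the matrix itself. The hidden size of 768 is that of Whisper Small; the rank of 8 is a hypothetical value chosen for illustration, since this card does not state the rank that was used.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two LoRA factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


D_MODEL = 768  # hidden size of Whisper Small
RANK = 8       # hypothetical rank; not stated in this card

full = D_MODEL * D_MODEL  # one dense projection matrix, bias ignored
lora = lora_param_count(D_MODEL, D_MODEL, RANK)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```

Only the two small factors are trained, while the original matrix stays frozen.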

Model Sources

Repository: https://github.com/Mehdipoladrag/Fine-tuning-Whisper-Model

Uses

Direct Use

Convert Persian speech to text
Subtitle generation for Persian audio
Conversational ASR
Podcast / video transcription
General Persian content recognition
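
A minimal inference sketch for the uses listed above, using the `transformers` and `peft` libraries. It assumes the Hub repository `DevMehdip/whisper-small-fa-lora` hosts the LoRA adapter in PEFT format on top of `openai/whisper-small`; if the repository holds merged weights instead, the loading step would differ. The function names are hypothetical.

```python
from typing import Dict

BASE_MODEL_ID = "openai/whisper-small"          # base model named in this card
ADAPTER_ID = "DevMehdip/whisper-small-fa-lora"  # assumed to be a PEFT adapter repo


def generation_options() -> Dict[str, str]:
    """Force Persian transcription instead of relying on language detection."""
    return {"language": "fa", "task": "transcribe"}


def load_model():
    """Load base Whisper, attach the LoRA adapter, and merge it for inference."""
    # Imported lazily so the helpers above work without these packages installed.
    from peft import PeftModel
    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    processor = WhisperProcessor.from_pretrained(BASE_MODEL_ID)
    base = WhisperForConditionalGeneration.from_pretrained(BASE_MODEL_ID)
    # Merging folds the LoRA weights into the base model and returns plain Whisper.
    model = PeftModel.from_pretrained(base, ADAPTER_ID).merge_and_unload()
    model.eval()
    return processor, model


def transcribe(audio, processor, model, sampling_rate: int = 16000) -> str:
    """Transcribe one mono waveform (1-D float array) sampled at 16 kHz."""
    import torch

    features = processor(
        audio, sampling_rate=sampling_rate, return_tensors="pt"
    ).input_features
    with torch.no_grad():
        token_ids = model.generate(input_features=features, **generation_options())
    return processor.batch_decode(token_ids, skip_special_tokens=True)[0]
```

Given a 16 kHz mono waveform `audio` (for instance the first value returned by `librosa.load(path, sr=16000)`), calling `processor, model = load_model()` followed by `transcribe(audio, processor, model)` should return the Persian transcript as a string.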

Downstream Use

Integrate into ASR pipelines
Use in real-time Persian voice applications
Further fine-tuning on custom Persian domains (medical, legal, etc.)
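
For domain adaptation, one possible starting point is to attach a fresh LoRA adapter to the base model with `peft`. The hyperparameters below are illustrative assumptions, not the values used to train this model, and the function names are hypothetical. Continuing from the published adapter is an alternative that is not shown here.

```python
def lora_settings() -> dict:
    """Illustrative LoRA hyperparameters; the card does not state the real ones."""
    return {
        "r": 8,                                  # rank of the low-rank update
        "lora_alpha": 16,                        # scaling factor
        "lora_dropout": 0.05,
        "bias": "none",
        "target_modules": ["q_proj", "v_proj"],  # Whisper attention projections
    }


def build_trainable_model(base_model_id: str = "openai/whisper-small"):
    """Wrap the base model so that only the LoRA weights are trainable."""
    # Imported lazily so lora_settings() works without these packages installed.
    from peft import LoraConfig, get_peft_model
    from transformers import WhisperForConditionalGeneration

    model = WhisperForConditionalGeneration.from_pretrained(base_model_id)
    model = get_peft_model(model, LoraConfig(**lora_settings()))
    model.print_trainable_parameters()  # reports trainable vs. total parameters
    return model
```

The returned model can then be passed to a standard training loop on the custom-domain data.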

Out-of-Scope Use

Non-Persian audio
Low-quality or noisy audio with overlapping multi-speaker speech
Misuse for surveillance or unethical monitoring

Bias, Risks, and Limitations

Whisper may still struggle with dialect-heavy, noisy, or low-quality audio.
The training dataset is relatively small (about 6,099 audio–subtitle pairs), so:
- Certain accents may be underrepresented.
- The model may hallucinate or mistranscribe in rare cases.

Recommendations

Users should:

Provide clean 16 kHz mono WAV audio
Use domain-specific fine-tuning if necessary
Validate outputs before critical use
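
One common way to validate outputs is to compare model transcripts against a small set of trusted reference transcripts using word error rate (WER). The sketch below is a minimal pure-Python version that assumes whitespace tokenization and applies no Persian-specific text normalization; the function names are hypothetical.

```python
from typing import List


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference transcript is empty")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

A WER of 0.0 means the transcript matches the reference word for word; values can exceed 1.0 when the hypothesis contains many extra words.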

Model tree for DevMehdip/whisper-small-fa-lora

Base model: openai/whisper-small
Adapter: DevMehdip/whisper-small-fa-lora (this model)