Whisper-Small Persian STT — LoRA Fine-Tuned
A fine-tuned version of openai/whisper-small for Persian speech-to-text (ASR) using LoRA.
This model is optimized for Persian conversational speech and clean, dataset-quality audio.
Model Details
Model Description
This model is a LoRA fine-tuned Whisper Small focused on Persian (fa) speech recognition.
It is intended to improve on the base model's transcription accuracy for standard Persian audio segments (16 kHz, mono, normalized WAV).
- Developed by: Mehdi Pouladrag
- Model type: Speech-to-Text (ASR) — Whisper Small (Seq2Seq Transformer)
- Language(s): Persian (fa)
- License: MIT
- Finetuned from: openai/whisper-small
- Dataset: persian-voice-v1 (single dataset)
- Training technique: LoRA (Low-Rank Adaptation)
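To give a rough sense of why LoRA is a lightweight training technique, the sketch below counts the parameters a single low-rank adapter adds to one square projection matrix. The hidden size of 768 is that of openai/whisper-small; the rank of 8 is a hypothetical value chosen for illustration, since the rank used for this model is not stated here.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter on a (d_out x d_in) weight matrix.

    LoRA keeps the pretrained weight W frozen and learns a low-rank update
    B @ A, where A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * d_in + d_out * rank


D_MODEL = 768  # hidden size of openai/whisper-small
RANK = 8       # hypothetical; the actual rank for this model is not documented

full = D_MODEL * D_MODEL
lora = lora_trainable_params(D_MODEL, D_MODEL, RANK)
print(f"full matrix: {full} params, LoRA update: {lora} params ({lora / full:.1%})")
```

Under these assumptions, the adapter trains only a few percent of the parameters of each matrix it is attached to, which is what makes fine-tuning on a small dataset practical.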
Model Sources
Uses
Direct Use
- Convert Persian speech to text
- Subtitle generation for Persian audio
- Conversational ASR
- Podcast / video transcription
- General Persian content recognition
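A minimal sketch of direct use with the `transformers` and `peft` libraries follows. It assumes the repository `DevMehdip/whisper-small-fa-lora` contains LoRA adapter weights (rather than a merged checkpoint) and that the processor is taken from the base model; adjust both if the repository is laid out differently. The helper names `load_model` and `transcribe` are illustrative, not part of any library.

```python
BASE_MODEL_ID = "openai/whisper-small"
ADAPTER_ID = "DevMehdip/whisper-small-fa-lora"  # assumed to hold LoRA adapter weights


def load_model():
    """Load the base Whisper model and merge the LoRA adapter into it."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    processor = WhisperProcessor.from_pretrained(BASE_MODEL_ID)
    base = WhisperForConditionalGeneration.from_pretrained(BASE_MODEL_ID)
    # Merging folds the low-rank update into the base weights for plain inference.
    model = PeftModel.from_pretrained(base, ADAPTER_ID).merge_and_unload()
    model.eval()
    return processor, model


def transcribe(samples, processor, model, sampling_rate=16000):
    """Transcribe a 1-D float waveform (mono, 16 kHz) to Persian text."""
    import torch

    inputs = processor(samples, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        ids = model.generate(inputs.input_features, language="fa", task="transcribe")
    return processor.batch_decode(ids, skip_special_tokens=True)[0]
```

Typical usage would be `processor, model = load_model()` followed by `transcribe(samples, processor, model)`, where `samples` is a NumPy array or list of floats sampled at 16 kHz.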
Downstream Use
- Integrate into ASR pipelines
- Use in real-time Persian voice applications
- Further fine-tuning on custom Persian domains (medical, legal, etc.)
Out-of-Scope Use
- Non-Persian audio
- Low-quality or noisy audio, and overlapping multi-speaker speech
- Misuse for surveillance or unethical monitoring
Bias, Risks, and Limitations
- Whisper may still struggle with dialect-heavy, noisy, or low-quality audio.
- The dataset used is relatively limited (~6099 audio–subtitle pairs), so:
  - Certain accents may be underrepresented.
  - The model may hallucinate or mis-transcribe in rare cases.
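One common way to quantify mis-transcriptions on your own audio is word error rate (WER): the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The function below is a plain-Python sketch of that metric, not the evaluation used for this model; it splits on whitespace only, so any Persian text normalization is left to the caller.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

A WER of 0.0 means the hypothesis matches the reference word for word; values can exceed 1.0 when the hypothesis contains many inserted words, which is one symptom of hallucination.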
Recommendations
Users should:
- Provide clean 16 kHz mono WAV audio
- Use domain-specific fine-tuning if necessary
- Validate outputs before critical use
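The first recommendation can be sketched with the Python standard library alone. The function below reads a 16-bit PCM WAV file, averages the channels to mono, resamples to 16 kHz by linear interpolation, and peak-normalizes. It is an illustration under those assumptions: the function name is hypothetical, and the interpolation applies no anti-aliasing filter, so a dedicated audio library is a better choice when downsampling real recordings.

```python
import array
import sys
import wave


def load_wav_mono_16k(path: str, target_rate: int = 16000) -> list[float]:
    """Read a 16-bit PCM WAV, downmix to mono, resample, and peak-normalize."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("only 16-bit PCM WAV is handled in this sketch")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        pcm = array.array("h", wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV samples are stored little-endian

    # Downmix: average the interleaved channels of each frame, scale to [-1, 1].
    mono = [sum(pcm[i:i + channels]) / (channels * 32768.0)
            for i in range(0, len(pcm), channels)]

    # Resample by linear interpolation between neighbouring samples.
    if rate != target_rate and mono:
        n_out = max(1, round(len(mono) * target_rate / rate))
        step = rate / target_rate
        last = len(mono) - 1
        resampled = []
        for k in range(n_out):
            pos = k * step
            i = min(int(pos), last)
            frac = pos - int(pos)
            resampled.append(mono[i] * (1 - frac) + mono[min(i + 1, last)] * frac)
        mono = resampled

    # Peak-normalize so the loudest sample has magnitude 1.0.
    peak = max((abs(s) for s in mono), default=0.0)
    return [s / peak for s in mono] if peak > 0 else mono
```

The returned list of floats can be passed to a Whisper feature extractor with `sampling_rate=16000`.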
Model tree for DevMehdip/whisper-small-fa-lora
Base model
openai/whisper-small