Whisper Large-v3 β€” Fine-tuned on Slovak Plenary ASR Corpus

This model is a fine-tuned version of openai/whisper-large-v3.
It is adapted for Slovak ASR using SloPalSpeech: 2,806 hours of aligned, ≀30 s speech–text pairs from official plenary sessions of the Slovak National Council.

  • Language: Slovak
  • Domain: Parliamentary / formal speech
  • Training data: 2,806 h
  • Model size: 1.54B parameters (F32, Safetensors)
  • Intended use: Slovak speech recognition; strongest in formal/public-speaking contexts
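
A minimal usage sketch with the Hugging Face Transformers pipeline API; the audio path is a placeholder:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint; add device="cuda:0" if a GPU is available.
asr = pipeline(
    "automatic-speech-recognition",
    model="erikbozik/whisper-large-v3-sk",
)

# "sample.wav" is a placeholder for any Slovak recording.
result = asr("sample.wav", return_timestamps=True)
print(result["text"])
```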

πŸ§ͺ Evaluation

| Dataset | Base WER | Fine-tuned WER | Ξ” (abs) |
|---|---|---|---|
| Common Voice 21 (sk) | 20.8 | 11.6 | -9.2 |
| FLEURS (sk) | 9.2 | 5.5 | -3.7 |

Numbers from the paper’s final benchmark runs.
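
As a rough guide to reproducing this kind of comparison, WER can be computed with the `evaluate` library; the decoding loop and text normalization are assumptions, not the paper's exact harness:

```python
import evaluate

# jiwer-backed word error rate metric
wer_metric = evaluate.load("wer")

# In practice, `predictions` would come from transcribing the test split of
# Common Voice 21 (sk) or FLEURS (sk); these toy strings only show the API.
references = ["vÑžené panie poslankyne vÑžení pÑni poslanci"]
predictions = ["vÑžené panie poslankyne vÑžení pÑni poslanci"]

wer = 100 * wer_metric.compute(references=references, predictions=predictions)
print(f"WER: {wer:.1f}")  # 0.0 for identical strings
```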

πŸ”§ Training Details

  • Framework: Hugging Face Transformers
  • Hardware: Multi-GPU setup (NVIDIA A10s) with Fully Sharded Data Parallel (FSDP)
  • Epochs: ~2 with early stopping on validation WER
  • Learning rate: 1e-5 with weight decay 0.01 to prevent overfitting
  • Notes: Training required sharded checkpoints; evaluation was run separately due to runtime compatibility issues
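
A hedged sketch of what that setup might look like with `Seq2SeqTrainingArguments`; only the learning rate, weight decay, epoch count, and FSDP use come from this card, and everything else is an illustrative assumption:

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: batch size, eval cadence, and precision are assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-large-v3-sk",
    learning_rate=1e-5,              # from the card
    weight_decay=0.01,               # from the card
    num_train_epochs=2,              # ~2 epochs with early stopping
    per_device_train_batch_size=8,   # assumption
    eval_strategy="steps",           # assumption (recent transformers naming)
    eval_steps=500,                  # assumption
    metric_for_best_model="wer",     # early stopping on validation WER
    greater_is_better=False,         # lower WER is better
    load_best_model_at_end=True,
    predict_with_generate=True,
    fsdp="full_shard",               # Fully Sharded Data Parallel
)
```

Early stopping itself would be wired in via `EarlyStoppingCallback` when constructing the `Seq2SeqTrainer`.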

⚠️ Limitations

  • Domain bias toward parliamentary speech (e.g., political vocabulary, formal register).
  • As with Whisper models generally, occasional hallucinations may appear; consider temperature fallback / compression-ratio checks at inference time (see the sketch after this list).
  • Multilingual performance is not guaranteed (full-parameter finetuning emphasized Slovak).
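
One way to wire that up in recent versions of Transformers; the threshold values mirror the openai/whisper defaults and are an assumption, not a recommendation from the paper:

```python
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="erikbozik/whisper-large-v3-sk")

# Temperature fallback: decoding is retried at increasing temperatures whenever
# the output fails the compression-ratio or log-probability checks.
result = asr(
    "long_recording.wav",        # placeholder path
    return_timestamps=True,      # enables long-form decoding
    generate_kwargs={
        "temperature": (0.0, 0.2, 0.4, 0.6, 0.8, 1.0),
        "compression_ratio_threshold": 1.35,  # flags repetitive, degenerate output
        "logprob_threshold": -1.0,            # flags low-confidence output
    },
)
print(result["text"])
```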

πŸ“ Citation & Paper

For more details, please see our paper on arXiv. If you use this model in your work, please cite it as:

@misc{boΕΎΓ­k2025slopalspeech2800hourslovakspeech,
      title={SloPalSpeech: A 2,800-Hour Slovak Speech Corpus from Parliamentary Data}, 
      author={Erik BoΕΎΓ­k and Marek Ε uppa},
      year={2025},
      eprint={2509.19270},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2509.19270}, 
}

πŸ™ Acknowledgements

This work was supported by VÚB Banka, which provided the GPU resources and backing necessary to accomplish it, enabling progress in Slovak ASR research.
