Whisper Large-v3 β€” Fine-tuned on Slovak Plenary ASR Corpus

This model is a fine-tuned version of openai/whisper-large-v3.
It is adapted for Slovak ASR using SloPalSpeech: 2,806 hours of aligned, ≀30 s speech–text pairs from official plenary sessions of the Slovak National Council.

  • Language: Slovak
  • Domain: Parliamentary / formal speech
  • Training data: 2,806 h
  • Model size: 1.54B parameters (F32, Safetensors)
  • Intended use: Slovak speech recognition; strongest in formal/public-speaking contexts
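
A minimal usage sketch with the Hugging Face Transformers pipeline API; the audio path is a placeholder:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint; add device="cuda:0" if a GPU is available.
asr = pipeline(
    "automatic-speech-recognition",
    model="erikbozik/whisper-large-v3-sk",
)

# "sample.wav" is a placeholder for any Slovak recording.
result = asr("sample.wav", return_timestamps=True)
print(result["text"])
```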

πŸ§ͺ Evaluation

| Dataset | Base WER | Fine-tuned WER | Ξ” (abs) |
|---|---|---|---|
| Common Voice 21 (sk) | 20.8 | 11.6 | -9.2 |
| FLEURS (sk) | 9.2 | 5.5 | -3.7 |

Numbers from the paper’s final benchmark runs.
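
As a rough guide to reproducing this kind of comparison, WER can be computed with the `evaluate` library; the decoding loop and text normalization are assumptions, not the paper's exact harness:

```python
import evaluate

# jiwer-backed word error rate metric
wer_metric = evaluate.load("wer")

# In practice, `predictions` would come from transcribing the test split of
# Common Voice 21 (sk) or FLEURS (sk); these toy strings only show the API.
references = ["vÑžené panie poslankyne vÑžení pÑni poslanci"]
predictions = ["vÑžené panie poslankyne vÑžení pÑni poslanci"]

wer = 100 * wer_metric.compute(references=references, predictions=predictions)
print(f"WER: {wer:.1f}")  # 0.0 for identical strings
```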

πŸ”§ Training Details

  • Framework: Hugging Face Transformers
  • Hardware: Multi-GPU setup (NVIDIA A10s) with Fully Sharded Data Parallel (FSDP)
  • Epochs: ~2 with early stopping on validation WER
  • Learning rate: 1e-5 with weight decay 0.01 to prevent overfitting
  • Notes: Training required sharded checkpoints; evaluation was run separately due to runtime compatibility issues
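
A hedged sketch of what that setup might look like with `Seq2SeqTrainingArguments`; only the learning rate, weight decay, epoch count, and FSDP use come from this card, and everything else is an illustrative assumption:

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: batch size, eval cadence, and precision are assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-large-v3-sk",
    learning_rate=1e-5,              # from the card
    weight_decay=0.01,               # from the card
    num_train_epochs=2,              # ~2 epochs with early stopping
    per_device_train_batch_size=8,   # assumption
    eval_strategy="steps",           # assumption (recent transformers naming)
    eval_steps=500,                  # assumption
    metric_for_best_model="wer",     # early stopping on validation WER
    greater_is_better=False,         # lower WER is better
    load_best_model_at_end=True,
    predict_with_generate=True,
    fsdp="full_shard",               # Fully Sharded Data Parallel
)
```

Early stopping itself would be wired in via `EarlyStoppingCallback` when constructing the `Seq2SeqTrainer`.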

⚠️ Limitations

  • Domain bias toward parliamentary speech (e.g., political vocabulary, formal register).
  • As with Whisper models generally, occasional hallucinations may appear; consider temperature fallback / compression-ratio checks at inference time (see the sketch after this list).
  • Multilingual performance is not guaranteed (full-parameter finetuning emphasized Slovak).
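
One way to wire that up in recent versions of Transformers; the threshold values mirror the openai/whisper defaults and are an assumption, not a recommendation from the paper:

```python
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="erikbozik/whisper-large-v3-sk")

# Temperature fallback: decoding is retried at increasing temperatures whenever
# the output fails the compression-ratio or log-probability checks.
result = asr(
    "long_recording.wav",        # placeholder path
    return_timestamps=True,      # enables long-form decoding
    generate_kwargs={
        "temperature": (0.0, 0.2, 0.4, 0.6, 0.8, 1.0),
        "compression_ratio_threshold": 1.35,  # flags repetitive, degenerate output
        "logprob_threshold": -1.0,            # flags low-confidence output
    },
)
print(result["text"])
```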

πŸ“ Citation & Paper

For more details, please see our paper on arXiv. If you use this model in your work, please cite it as:

@misc{boΕΎΓ­k2025slopalspeech2800hourslovakspeech,
      title={SloPalSpeech: A 2,800-Hour Slovak Speech Corpus from Parliamentary Data}, 
      author={Erik BoΕΎΓ­k and Marek Ε uppa},
      year={2025},
      eprint={2509.19270},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2509.19270}, 
}

πŸ™ Acknowledgements

This work was supported by VÚB Banka, which provided the GPU resources and backing necessary to accomplish it, enabling progress in Slovak ASR research.
