Songhoy-ASR-v1: First Open-Source Speech Recognition Model for Songhoy

Songhoy-ASR-v1 is the first open-source speech recognition model for Songhoy, a language spoken by over 3 million people across Mali, Niger, and Burkina Faso. Developed as part of the MALIBA-AI initiative, it reaches a word error rate of 16.58% on our test dataset and gives Songhoy speakers access to open speech technology.

Model Overview

The model performs well on Songhoy speech recognition, with particular strengths in:

Pure Songhoy recognition: Accurate transcription of traditional and contemporary Songhoy speech
Code-switching handling: Effectively manages the natural mixing of Songhoy with French
Dialect adaptation: Works across regional variations of Songhoy
Noise resilience: Maintains accuracy even with moderate background noise

Performance Metrics

Songhoy-ASR-v1 achieves the following results on our test dataset:

Metric	Value
Word Error Rate (WER)	16.58%
Character Error Rate (CER)	4.63%

To our knowledge, these are the best publicly available results for Songhoy speech recognition, and they make the model a practical starting point for production applications.
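
For readers unfamiliar with the two metrics, the sketch below shows how WER and CER are commonly computed: the edit distance between reference and hypothesis, divided by the reference length, counted in words or in characters. It is an illustration only. The evaluation script and any text normalization behind the figures above are not described in this card, and the strings are placeholder tokens rather than Songhoy text.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over reference length."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Placeholder tokens: one substituted word and one dropped word.
reference = "a b c d"
hypothesis = "a x c"
word_error_rate = wer(reference, hypothesis)
char_error_rate = cer(reference, hypothesis)
```

Reported scores are usually aggregated over the whole test set (total edits divided by total reference length) rather than averaged per utterance.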

Technical Details

The model is a fine-tuned version of OpenAI's Whisper-large-v2, adapted specifically for Songhoy using LoRA (Low-Rank Adaptation). LoRA trains a small set of low-rank adapter weights and leaves the base model's weights frozen, which kept fine-tuning efficient while maintaining the multilingual capabilities of the base model.
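
To give a sense of why LoRA is parameter-efficient, the sketch below compares the size of one full attention projection matrix with the two low-rank factors LoRA would train in its place. The hidden size of 1280 is that of the Whisper large architecture. The rank of 32 is a hypothetical value chosen for illustration, since this card does not state the LoRA rank or which modules were adapted.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Parameters in the two LoRA factors: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


D_MODEL = 1280  # hidden size of Whisper large models
RANK = 32       # hypothetical LoRA rank, not stated in this card

full = D_MODEL * D_MODEL                              # one frozen projection matrix
lora = lora_trainable_params(D_MODEL, D_MODEL, RANK)  # trainable adapter weights
fraction = lora / full

print(f"full: {full:,}  lora: {lora:,}  fraction: {fraction:.1%}")
```

Under these assumptions, the adapter for a single projection holds about 5% as many parameters as the matrix it modifies.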

Training Information

Base Model: openai/whisper-large-v2
Fine-tuning Method: LoRA (Parameter-Efficient Fine-Tuning)
Training Dataset: [coming soon]
Training Duration: 4 epochs
Batch Size: 32 (8 per device with gradient accumulation steps of 4)
Learning Rate: 0.001 with linear scheduler and 50 warmup steps
Mixed Precision: Native AMP
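
The batch size and learning-rate settings above can be sketched numerically. The snippet assumes a single training device (inferred from 32 = 8 x 4) and takes the total of 976 optimizer steps from the final row of the training results table. The schedule follows the usual linear warmup then linear decay shape; the function name is hypothetical.

```python
PER_DEVICE_BATCH = 8
GRAD_ACCUM_STEPS = 4
NUM_DEVICES = 1  # assumption: inferred from the stated effective batch size of 32

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES


def linear_schedule_lr(step, base_lr=1e-3, warmup_steps=50, total_steps=976):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


# Sample the schedule at the start, mid-warmup, peak, midpoint of decay, and end.
for step in (0, 25, 50, 513, 976):
    print(step, linear_schedule_lr(step))
```

The peak rate of 0.001 is reached at step 50 and decays to zero by the final step.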

Training Results

Training Loss	Epoch	Step	Validation Loss
0.3661	1.0	245	0.3118
0.2712	2.0	490	0.2215
0.2008	3.0	735	0.2011
0.1518	3.9857	976	0.1897
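
Because the training dataset is not yet documented, the table above can be used for a rough back-of-the-envelope estimate of its size. The sketch below assumes the effective batch size of 32 and that every optimizer step consumed a full batch; the result is an approximation, not a reported figure. It also computes how much the validation loss dropped from one checkpoint to the next.

```python
EFFECTIVE_BATCH = 32  # from the training information above

# (training loss, epoch, step, validation loss), copied from the table
rows = [
    (0.3661, 1.0, 245, 0.3118),
    (0.2712, 2.0, 490, 0.2215),
    (0.2008, 3.0, 735, 0.2011),
    (0.1518, 3.9857, 976, 0.1897),
]

# Optimizer steps per epoch, estimated from the final row.
_, final_epoch, final_step, _ = rows[-1]
steps_per_epoch = final_step / final_epoch
approx_train_examples = steps_per_epoch * EFFECTIVE_BATCH

# Drop in validation loss between consecutive checkpoints.
val_losses = [row[3] for row in rows]
val_drops = [round(a - b, 4) for a, b in zip(val_losses, val_losses[1:])]

print(f"~{approx_train_examples:.0f} training examples (rough estimate)")
print("validation loss drops:", val_drops)
```

The shrinking drops suggest that most of the improvement came in the first two epochs.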

Real-World Applications

Songhoy-ASR-v1 enables numerous applications previously unavailable to Songhoy speakers:

Media Transcription: Automatic subtitling of Songhoy content
Voice Interfaces: Voice-controlled applications in Songhoy
Educational Tools: Language learning and literacy applications
Cultural Preservation: Documentation of oral histories and traditions
Healthcare Communication: Improved access to health information
Accessibility Solutions: Tools for deaf and hard-of-hearing users

Usage Examples

  Coming soon
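
Until official examples are published, the following is a minimal sketch of how a LoRA-adapted Whisper model is typically loaded with Transformers and PEFT. Several things here are assumptions: the repository id is taken from the citation URL in this card, the repository is assumed to hold LoRA adapter weights rather than a merged checkpoint, and the helper names `load_songhoy_asr` and `transcribe` are hypothetical. Any language or task prompt used during fine-tuning is not documented here, so generation relies on the model's defaults.

```python
BASE_MODEL_ID = "openai/whisper-large-v2"  # base model named in this card
ADAPTER_ID = "MALIBA-AI/songhoy-asr-v1"    # assumed repo id, taken from the citation URL
SAMPLE_RATE = 16_000                       # sampling rate Whisper expects


def load_songhoy_asr(adapter_id=ADAPTER_ID, base_model_id=BASE_MODEL_ID):
    """Load base Whisper, apply the LoRA adapter, and return (model, processor)."""
    # Heavy dependencies are imported lazily so this module imports without them.
    from peft import PeftModel
    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    processor = WhisperProcessor.from_pretrained(base_model_id)
    base = WhisperForConditionalGeneration.from_pretrained(base_model_id)
    # Assumes the repo contains LoRA adapter weights, not a merged model.
    model = PeftModel.from_pretrained(base, adapter_id)
    model = model.merge_and_unload()  # fold the adapter into the base weights
    model.eval()
    return model, processor


def transcribe(model, processor, waveform):
    """Transcribe a mono float waveform sampled at 16 kHz."""
    import torch

    inputs = processor(waveform, sampling_rate=SAMPLE_RATE, return_tensors="pt")
    with torch.no_grad():
        token_ids = model.generate(
            input_features=inputs.input_features, max_new_tokens=255
        )
    return processor.batch_decode(token_ids, skip_special_tokens=True)[0]
```

A typical call would be `model, processor = load_songhoy_asr()` followed by `transcribe(model, processor, waveform)`, where `waveform` is audio already resampled to 16 kHz.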

Limitations

[Coming Soon]

Part of MALIBA-AI's African Language Initiative

Songhoy-ASR-v1 is part of MALIBA-AI's commitment to developing speech technology for all Malian languages. This model represents a significant step toward digital inclusion for Songhoy speakers and demonstrates the potential for high-quality AI systems for African languages.

Our mission of "No Malian Language Left Behind" drives us to develop technologies that:

Preserve linguistic diversity
Enable access to digital tools regardless of language
Support local innovation and content creation
Bridge the digital divide for all Malians

Framework Versions

PEFT 0.14.1.dev0
Transformers 4.50.0.dev0
PyTorch 2.5.1+cu124
Datasets 3.2.0
Tokenizers 0.21.0

License

This model is released under the Apache 2.0 license.

Citation

@misc{songhoy-asr-v1,
  author = {MALIBA-AI},
  title = {Songhoy-ASR-v1: Speech Recognition for Songhoy},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/MALIBA-AI/songhoy-asr-v1}}
}

MALIBA-AI: Empowering Mali's Future Through Community-Driven AI Innovation

"No Malian Language Left Behind"
