---
library_name: peft
license: apache-2.0
base_model: openai/whisper-large-v2
tags:
- automatic-speech-recognition
- whisper
- asr
- songhoy
- hsn
- Mali
- MALIBA-AI
- lora
- fine-tuned
- code-switching
- african-language
language:
- hsn
- fr
language_bcp47:
- hsn-ML
- fr-ML
model-index:
- name: songhoy-asr-v1
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: songhoy-asr
      type: custom
      split: test
      args:
        language: hsn
    metrics:
    - name: WER
      type: wer
      value: 16.58
    - name: CER
      type: cer
      value: 4.63
pipeline_tag: automatic-speech-recognition
---

# Songhoy-ASR-v1: First Open-Source Speech Recognition Model for Songhoy

Songhoy-ASR-v1 is the **first open-source speech recognition model** for Songhoy, a language spoken by over 3 million people across Mali, Niger, and Burkina Faso. Developed as part of the MALIBA-AI initiative, it makes speech technology openly available to Songhoy speakers.

## Model Overview

The model targets Songhoy speech recognition, with particular attention to:

- **Pure Songhoy recognition**: Accurate transcription of traditional and contemporary Songhoy speech
- **Code-switching handling**: Effectively manages the natural mixing of Songhoy with French
- **Dialect adaptation**: Works across regional variations of Songhoy
- **Noise resilience**: Maintains accuracy even with moderate background noise

## Performance Metrics

Songhoy-ASR-v1 achieves the following results on the test split of our songhoy-asr dataset:

| Metric | Value |
|--------|-------|
| Word Error Rate (WER) | 16.58% |
| Character Error Rate (CER) | 4.63% |

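To our knowledge, these are the best publicly available results for Songhoy speech recognition. A WER of 16.58% corresponds to roughly one word error per six reference words, which makes the model a practical starting point for production applications.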
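For readers unfamiliar with the two metrics, the sketch below shows the usual edit-distance definitions: WER counts word-level substitutions, deletions, and insertions divided by the number of reference words, and CER does the same at character level. It is an illustration only, not the evaluation script used for this model; the function names and the placeholder strings are assumptions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,         # deletion
                            curr[j - 1] + 1,     # insertion
                            prev[j - 1] + cost)) # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits (spaces included)
    divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Placeholder tokens, not real Songhoy text.
print(wer("a b c d", "a x c"))  # one substitution + one deletion over 4 words
print(cer("abcd", "abxd"))      # one substitution over 4 characters
```

Both functions return a fraction; multiply by 100 to compare with the percentages in the table.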

## Technical Details

The model is a fine-tuned version of OpenAI's Whisper-large-v2, adapted specifically for Songhoy using LoRA (Low-Rank Adaptation). LoRA trains small low-rank adapter matrices while the base model weights stay frozen, which keeps fine-tuning efficient and helps retain the multilingual capabilities of the base model.
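As a rough illustration of the LoRA idea, the sketch below applies a low-rank update to a single linear layer: the frozen weight `W` is left untouched and a scaled product `B @ A` is added to its output. The rank and `alpha` values are hypothetical, since the adapter configuration is not listed in this card; 1280 is the hidden size of Whisper large.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha):
    """Compute W x + (alpha / r) * B (A x) for one linear layer.

    W: frozen weight, shape (d_out, d_in)
    A: trainable, shape (r, d_in);  B: trainable, shape (d_out, r)
    """
    rank = len(A)
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * d for b, d in zip(base, delta)]


def lora_param_counts(d_out, d_in, rank):
    """Return (full weight parameters, LoRA adapter parameters)."""
    return d_out * d_in, rank * (d_in + d_out)


W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]             # rank 1
B_zero = [[0.0], [0.0]]      # B starts at zero, so the update is a no-op
print(lora_forward(W, A, B_zero, [2.0, 3.0], alpha=2.0))

# Hypothetical rank of 32 on a 1280 x 1280 projection.
full, adapter = lora_param_counts(1280, 1280, 32)
print(full, adapter, adapter / full)
```

Because only `A` and `B` are trained, the adapter for one such projection holds a small fraction of the parameters of the full weight matrix.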

### Training Information
- **Base Model**: openai/whisper-large-v2
- **Fine-tuning Method**: LoRA (Parameter-Efficient Fine-Tuning)
- **Training Dataset**: [coming soon]
- **Training Duration**: 4 epochs
- **Batch Size**: 32 (8 per device with gradient accumulation steps of 4)
- **Learning Rate**: 0.001 with linear scheduler and 50 warmup steps
- **Mixed Precision**: Native AMP
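
The hyperparameters above imply a simple learning-rate curve and effective batch size, sketched below. The total of 976 optimizer steps is taken from the training results table, and a single training device is an assumption; the formula mirrors the common linear-warmup, linear-decay schedule rather than the exact training script.

```python
def linear_schedule_lr(step, base_lr=1e-3, warmup_steps=50, total_steps=976):
    """Learning rate at a given optimizer step.

    Rises linearly from 0 to base_lr over warmup_steps, then decays
    linearly to 0 at total_steps.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


per_device_batch_size = 8
gradient_accumulation_steps = 4
num_devices = 1  # assumption: single-device training
effective_batch_size = (
    per_device_batch_size * gradient_accumulation_steps * num_devices
)

print(effective_batch_size)
for step in (0, 25, 50, 490, 976):
    print(step, linear_schedule_lr(step))
```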

### Training Results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.3661        | 1.0    | 245  | 0.3118          |
| 0.2712        | 2.0    | 490  | 0.2215          |
| 0.2008        | 3.0    | 735  | 0.2011          |
| 0.1518        | 3.9857 | 976  | 0.1897          |

## Real-World Applications

Songhoy-ASR-v1 enables numerous applications previously unavailable to Songhoy speakers:

- **Media Transcription**: Automatic subtitling of Songhoy content
- **Voice Interfaces**: Voice-controlled applications in Songhoy
- **Educational Tools**: Language learning and literacy applications
- **Cultural Preservation**: Documentation of oral histories and traditions
- **Healthcare Communication**: Improved access to health information
- **Accessibility Solutions**: Tools for deaf and hard-of-hearing users

## Usage Examples

```
  Coming soon
```
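
Pending official examples, the following is a minimal sketch of how a PEFT LoRA adapter for Whisper is typically loaded and used. It assumes the adapter is published as `MALIBA-AI/songhoy-asr-v1` (the repository named in the citation below), that it loads with `PeftModel.from_pretrained` on top of `openai/whisper-large-v2`, and that the input is a mono waveform sampled at 16 kHz. The function names are hypothetical, and generation settings such as language or task prompts are left at their defaults.

```python
def load_songhoy_asr(adapter_id="MALIBA-AI/songhoy-asr-v1",
                     base_id="openai/whisper-large-v2"):
    """Load the base Whisper model and attach the LoRA adapter.

    The adapter_id default is an assumption based on the citation URL.
    """
    # Imported lazily so the module can be imported without these packages.
    from peft import PeftModel
    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    processor = WhisperProcessor.from_pretrained(base_id)
    base_model = WhisperForConditionalGeneration.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return processor, model


def transcribe(audio, processor, model, sampling_rate=16000):
    """Transcribe a mono waveform (1-D float array) sampled at 16 kHz."""
    import torch

    inputs = processor(audio, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        token_ids = model.generate(input_features=inputs.input_features)
    return processor.batch_decode(token_ids, skip_special_tokens=True)[0]
```

A typical call would be `processor, model = load_songhoy_asr()` followed by `transcribe(waveform, processor, model)`, where `waveform` is audio already resampled to 16 kHz.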

## Limitations

[Coming Soon]
<!-- 
- Performance varies with different regional dialects of Songhoy
- Very specific technical terminology may have lower accuracy
- Extreme background noise can impact transcription quality
- Very young speakers or non-native speakers may have reduced accuracy
- Limited performance with extremely low-quality audio recordings -->

## Part of MALIBA-AI's African Language Initiative

Songhoy-ASR-v1 is part of MALIBA-AI's commitment to developing speech technology for all Malian languages. This model represents a significant step toward digital inclusion for Songhoy speakers and demonstrates the potential for high-quality AI systems for African languages.

Our mission of "No Malian Language Left Behind" drives us to develop technologies that:
- Preserve linguistic diversity
- Enable access to digital tools regardless of language
- Support local innovation and content creation
- Bridge the digital divide for all Malians

## Framework Versions
- PEFT 0.14.1.dev0
- Transformers 4.50.0.dev0
- PyTorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0

## License

This model is released under the Apache 2.0 license.

## Citation

```bibtex
@misc{songhoy-asr-v1,
  author = {MALIBA-AI},
  title = {Songhoy-ASR-v1: Speech Recognition for Songhoy},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/MALIBA-AI/songhoy-asr-v1}}
}
```

---

**MALIBA-AI: Empowering Mali's Future Through Community-Driven AI Innovation**

*"No Malian Language Left Behind"*