---
license: apache-2.0
datasets:
- Oumar199/french_wolof_corpus
language:
- fr
- wo
metrics:
- bleu
- rouge
base_model:
- facebook/nllb-200-distilled-600M
pipeline_tag: translation
library_name: transformers
---

# Fine-tuned NLLB-200 Distilled for French-to-Wolof Translation

This model is a fine-tuned version of the [NLLB-200 Distilled 600M](https://huggingface.co/facebook/nllb-200-distilled-600M) model, designed specifically for **French-to-Wolof translation**. The fine-tuning process leveraged the methodologies and algorithmic insights presented in the paper **"Advancing Wolof-French Sentence Translation: Comparative Analysis of Transformer-Based Models and Methodological Insights"** by Kane et al., which provides a comprehensive analysis of transformer architecture adaptations for Wolof-French sentence translation.

## Model Description

By adapting and fine-tuning NLLB-200 Distilled, a state-of-the-art multilingual sequence-to-sequence model, this model improves translation quality for the French-Wolof language pair, focusing on enhanced representation and transfer-learning techniques suited to low-resource languages.

## Intended Use

- Translate text from French to Wolof.
- Support NLP applications and research involving Wolof language processing, especially machine translation.
- Enable further fine-tuning or transfer learning for similar African language pairs.

## How to Use

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "Oumar199/nllb_french_wolof"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

inputs = tokenizer("Bonjour. Est-ce que le poulet est cuit ?", return_tensors="pt")
outputs = model.generate(**inputs)

# Decode the first (and only) generated sequence.
translation = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(translation)
# Expected output: Jàmm nga am ? Ndax ganaar gi ñor na ?
```

## Limitations and Bias

- Wolof is a low-resource language, so performance may vary on complex or uncommon phrases.
- Translation quality depends on how closely the input text resembles the training distribution.
- **Generation hyperparameters (temperature, number of beams, top-k, maximum length, etc.) can significantly influence output quality. Default values may not yield the best results, so tune these parameters for your specific use case or evaluation setting; see the sketch at the end of this card.**

## Training Data

Fine-tuned on a curated parallel corpus of French-Wolof sentences assembled to capture diverse linguistic phenomena, as outlined in Kane et al.

## Training Procedure

Fine-tuning used the NLLB-200 Distilled architecture with hyperparameters and techniques inspired by the comparative analysis in the Kane et al. paper, optimizing for BLEU score and translation fluency.

## Evaluation

Evaluated with BLEU and ROUGE scores on a held-out Wolof-French test set, showing improved translation accuracy and quality over the baseline NLLB-200 model without fine-tuning.

---

*Model card created to ensure transparency and reproducibility, and to assist the community interested in African language NLP.*
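
## Example: Adjusting Generation Hyperparameters

As noted in the limitations above, decoding settings matter. The snippet below is a minimal sketch of how generation hyperparameters can be adjusted with `model.generate`. The specific values (`num_beams=5`, `max_length=128`, `no_repeat_ngram_size=3`) are illustrative starting points, not the configuration used to produce the reported results, and the use of the standard NLLB-200 language codes (`fra_Latn`, `wol_Latn`) with `forced_bos_token_id` is an assumption; it may be unnecessary if the fine-tuned model's generation config already targets Wolof.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "Oumar199/nllb_french_wolof"

# NLLB tokenizers accept a source-language code; "fra_Latn" is the standard
# NLLB-200 code for French (assumption: the fine-tuned tokenizer keeps these codes).
tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="fra_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

inputs = tokenizer("Bonjour. Est-ce que le poulet est cuit ?", return_tensors="pt")

# Illustrative decoding settings; tune them on a validation set for your use case.
outputs = model.generate(
    **inputs,
    # Force Wolof as the target language ("wol_Latn" is the NLLB-200 code);
    # possibly redundant if the fine-tuned generation config already sets it.
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("wol_Latn"),
    num_beams=5,             # beam search usually improves MT quality over greedy decoding
    max_length=128,          # cap on the number of generated tokens
    no_repeat_ngram_size=3,  # discourage repeated phrases
    early_stopping=True,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Beam search with a moderate beam width is a common default for machine translation; sampling-based settings (temperature, top-k) tend to be more useful when diversity matters more than adequacy.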