---
language:
- ar
pipeline_tag: audio-classification
library_name: speechbrain
tags:
- DIalectID
- ADI
- ADI-20
- speechbrain
- Identification
- pytorch
- embeddings
datasets:
- ADI-20
metrics:
- f1
- precision
- recall
- accuracy
---

## Install Requirements

### SpeechBrain

First, install SpeechBrain from the `develop` branch:

```bash
pip install git+https://github.com/speechbrain/speechbrain.git@develop
```

### Clone the ADI-20 GitHub repository

Then clone the repository and install its dependencies:

```bash
git clone https://github.com/elyadata/ADI-20
cd ADI-20
pip install -r requirements.txt
```
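
Before running inference, it can help to confirm that the installed packages are importable from the active environment. The following is a minimal sketch using only the standard library. The helper name `missing_packages` is hypothetical, and the import names `speechbrain` and `torch` are assumptions based on the library and tags listed for this model; adjust them to match `requirements.txt`.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names; adjust to match requirements.txt.
    missing = missing_packages(["speechbrain", "torch"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All checked packages are importable.")
```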

### Perform Arabic Dialect Identification

```python
from inference.classifier_attention_pooling import WhisperDialectClassifier

# Load the pretrained classifier and move it to the GPU.
dialect_id = WhisperDialectClassifier.from_hparams(
    source="",
    hparams_file="hyperparms.yaml",
    savedir="pretrained_DID/tmp",
).to("cuda")

# Keep the classifier's device attribute consistent with the model's device.
dialect_id.device = "cuda"

# Classify a single audio file.
dialect_id.classify_file("filename.wav")
```
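
To classify several recordings at once, the single-file call above can be wrapped in a small loop. This is a sketch under stated assumptions: `classify_directory` is a hypothetical helper, not part of the ADI-20 repository, and it stores whatever `classify_file` returns without assuming its format. It takes the classification function as an argument, so it does not depend on how the model was loaded.

```python
from pathlib import Path


def classify_directory(classify_file, audio_dir, pattern="*.wav"):
    """Run `classify_file` on every file in `audio_dir` that matches `pattern`.

    Returns a dict mapping each file path (as a string) to the value
    `classify_file` returned for that file.
    """
    results = {}
    # Sort the paths so files are processed in a reproducible order.
    for path in sorted(Path(audio_dir).glob(pattern)):
        results[str(path)] = classify_file(str(path))
    return results
```

With the classifier loaded as shown earlier, a call such as `classify_directory(dialect_id.classify_file, "audio/")` would process every `.wav` file in a directory named `audio/` (a placeholder path).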

### Citation

If you use this work, please cite:

```
@inproceedings{elleuch2025adi20,
  author    = {Haroun Elleuch and Salima Mdhaffar and Yannick Estève and Fethi Bougares},
  title     = {ADI-20: Arabic Dialect Identification Dataset and Models},
  booktitle = {Proceedings of the Annual Conference of the International Speech Communication Association (Interspeech)},
  year      = {2025},
  address   = {Rotterdam Ahoy Convention Centre, Rotterdam, The Netherlands},
  month     = {August},
  days      = {17-21}
}
```