added updates

Files changed (5)

README.md +75 -0
VALID_yoruba_yor_audio_data.csv +7 -0
afrospeech-wav2vec-yor_METRICS_VALID.json +1 -0
afrospeech-wav2vec-yor_confusion_matrix_VALID.png +0 -0
digits-bar-plot-for-afrospeech-wav2vec-yor.png +0 -0

README.md ADDED

	@@ -0,0 +1,75 @@

+---
+license: apache-2.0
+tags:
+- afro-digits-speech
+datasets:
+- crowd-speech-africa
+metrics:
+- accuracy
+model-index:
+- name: afrospeech-wav2vec-yor
+  results:
+  - task:
+      name: Audio Classification
+      type: audio-classification
+    dataset:
+      name: Afro Speech
+      type: chrisjay/crowd-speech-africa
+      args: no
+    metrics:
+       - name: Validation Accuracy
+         type: accuracy
+         value: 0.83
+---
+# afrospeech-wav2vec-yor
+This model is a fine-tuned version of [facebook/wav2vec2-base](https://huggingface.co/facebook/wav2vec2-base) on the [crowd-speech-africa](https://huggingface.co/datasets/chrisjay/crowd-speech-africa) dataset, which was crowd-sourced through the [afro-speech Space](https://huggingface.co/spaces/chrisjay/afro-speech). It achieves the following results on the [validation set](VALID_yoruba_yor_audio_data.csv):
+- F1: 0.83
+- Accuracy: 0.83
+The confusion matrix below breaks down the model's performance by digit. It shows which digits are confused with one another, and per-digit precision and recall can be read from it.
+![confusion matrix](afrospeech-wav2vec-yor_confusion_matrix_VALID.png)
+## Training and evaluation data
+The model was trained on mixed audio data in Yoruba (`yor`).
+- Size of training set: 22
+- Size of validation set: 6
+Below is the distribution of digits in the dataset (training and validation):
+![digits-bar-plot-for-afrospeech](digits-bar-plot-for-afrospeech-wav2vec-yor.png)
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 3e-05
+- train_batch_size: 64
+- eval_batch_size: 64
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- num_epochs: 150
+### Training results
+| Training Loss | Epoch | Validation Accuracy |
+|:-------------:|:-----:|:-------------------:|
+| 0.596         | 1     | 0.5                 |
+| 0.0220        | 50    | 0.5                 |
+| 0.00305       | 100   | 0.667               |
+| 0.0993        | 150   | 0.667               |
+### Framework versions
+- Transformers 4.21.3
+- PyTorch 1.12.0
+- Datasets 1.14.0
+- Tokenizers 0.12.1
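
A note on the hyperparameters listed in the README above: with 22 training clips and a batch size of 64, the whole training set fits in a single batch. The sketch below works through that arithmetic. It assumes a single device, no gradient accumulation, and that the last partial batch is kept, none of which the README states; `steps_per_epoch` is an illustrative helper, not part of the training code.

```python
import math

# Values taken from the README's dataset sizes and hyperparameter list.
TRAIN_SIZE = 22
TRAIN_BATCH_SIZE = 64
NUM_EPOCHS = 150


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Optimizer steps per epoch when the final partial batch is kept."""
    return math.ceil(num_examples / batch_size)


steps = steps_per_epoch(TRAIN_SIZE, TRAIN_BATCH_SIZE)
total_steps = steps * NUM_EPOCHS
print(steps, total_steps)  # 1 150
```

Under these assumptions, 150 epochs correspond to roughly 150 full-batch updates, which is worth keeping in mind when reading the loss values in the training results table.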

VALID_yoruba_yor_audio_data.csv ADDED

	@@ -0,0 +1,7 @@

+audio_path,transcript,lang,lang_code,gender,age,country,accent
+AUDIO_HOMEPATH/data/YEPqZ3CHDppprriDgoGA9MvOgRNMf43F/audio.wav,3,yoruba,yor,Male,,Nigeria,Standard Yoruba
+AUDIO_HOMEPATH/data/Cn8ve720zqPppRPdI2TXbkwHNdmtVUPf/audio.wav,5,yoruba,yor,Male,,Nigeria,Standard Yoruba
+AUDIO_HOMEPATH/data/2mzLDDHVy5zA4cH5NPK0VU3rtZKVZ2kK/audio.wav,7,yoruba,yor,,,Nigeria,Standard Yoruba
+AUDIO_HOMEPATH/data/Xbiq3Nv8JSko22be3orDMcDurESjl5xc/audio.wav,7,yoruba,yor,Female,,United States,
+AUDIO_HOMEPATH/data/zb6CyNeMcgyfSk5LvFzhmjgDbeC2goeM/audio.wav,6,yoruba,yor,Female,,United States,
+AUDIO_HOMEPATH/data/Y2Kr1zV41BVuW3gsaK1LCSS0d8y1P023/audio.wav,9,yoruba,yor,Female,,United States,
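
The validation CSV above can be read with Python's standard `csv` module. The following is a minimal sketch rather than the author's loading code: it embeds the header and two of the rows inline, and the root `/path/to/audio` substituted for the `AUDIO_HOMEPATH` placeholder is a hypothetical value, as are the helper name `load_rows` and the added `label` key.

```python
import csv
import io
from collections import Counter

# Header and two rows copied from VALID_yoruba_yor_audio_data.csv.
SAMPLE = """audio_path,transcript,lang,lang_code,gender,age,country,accent
AUDIO_HOMEPATH/data/YEPqZ3CHDppprriDgoGA9MvOgRNMf43F/audio.wav,3,yoruba,yor,Male,,Nigeria,Standard Yoruba
AUDIO_HOMEPATH/data/Xbiq3Nv8JSko22be3orDMcDurESjl5xc/audio.wav,7,yoruba,yor,Female,,United States,
"""

AUDIO_ROOT = "/path/to/audio"  # hypothetical; substitute the real location


def load_rows(text: str, audio_root: str) -> list:
    """Parse the CSV text, resolve the path placeholder, and add an int label."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["audio_path"] = row["audio_path"].replace("AUDIO_HOMEPATH", audio_root, 1)
        row["label"] = int(row["transcript"])  # the transcript column holds the spoken digit
        rows.append(row)
    return rows


rows = load_rows(SAMPLE, AUDIO_ROOT)
label_counts = Counter(row["label"] for row in rows)
print(rows[0]["audio_path"])
print(dict(label_counts))
```

Empty fields such as `age` and, for some rows, `gender` and `accent` come back as empty strings, so downstream code should not assume speaker metadata is always present.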

afrospeech-wav2vec-yor_METRICS_VALID.json ADDED

	@@ -0,0 +1 @@


+{"acc": 0.8333333333333334, "f1": 0.8333333333333334}
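
With only 6 validation clips, each prediction moves accuracy by 1/6. The sketch below checks that the reported `acc` is consistent with 5 correct predictions out of 6 and lists the accuracy values a 6-example set can produce. It is an arithmetic check only: the per-clip predictions are not part of this commit, and the F1 averaging scheme is not stated in it (identical `acc` and `f1` values are what micro-averaged F1 would give for single-label classification, but that is an assumption here).

```python
# Validation set size from the README; accuracy from the metrics JSON.
VALID_SIZE = 6
REPORTED_ACC = 0.8333333333333334

# Every accuracy value that k correct predictions out of 6 can yield.
possible = [k / VALID_SIZE for k in range(VALID_SIZE + 1)]

# Number of correct predictions whose accuracy is closest to the reported one.
correct = min(range(VALID_SIZE + 1), key=lambda k: abs(possible[k] - REPORTED_ACC))

print(correct)
print([round(p, 3) for p in possible])
```

The same granularity explains the values in the README's training results table: 0.5 corresponds to 3 of 6 correct and 0.667 to 4 of 6.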

afrospeech-wav2vec-yor_confusion_matrix_VALID.png ADDED

digits-bar-plot-for-afrospeech-wav2vec-yor.png ADDED