wav2vec2-base-Drum_Kit_Sounds
This model is a fine-tuned version of facebook/wav2vec2-base.
It achieves the following results on the evaluation set:
- Loss: 1.0887
- Accuracy: 0.7812
- F1
- Weighted: 0.7692
- Micro: 0.7812
- Macro: 0.7845
- Recall
- Weighted: 0.7812
- Micro: 0.7812
- Macro: 0.8187
- Precision
- Weighted: 0.8717
- Micro: 0.7812
- Macro: 0.8534
Model description
This is a multiclass classification of sounds to determine which type of drum is hit in the audio sample. The options are: kick, overheads, snare, and toms.
For more information on how it was created, check out the following link: https://github.com/DunnBC22/Vision_Audio_and_Multimodal_Projects/blob/main/Audio-Projects/Classification/Audio-Drum_Kit_Sounds.ipynb
Intended uses & limitations
This model is intended to demonstrate my ability to solve a complex problem using technology.
Training and evaluation data
Dataset Source: https://www.kaggle.com/datasets/anubhavchhabra/drum-kit-sound-samples
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 12
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | Weighted F1 | Micro F1 | Macro F1 | Weighted Recall | Micro Recall | Macro Recall | Weighted Precision | Micro Precision | Macro Precision |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1.3743 | 1.0 | 4 | 1.3632 | 0.5625 | 0.5801 | 0.5625 | 0.5678 | 0.5625 | 0.5625 | 0.5670 | 0.6786 | 0.5625 | 0.6429 |
1.3074 | 2.0 | 8 | 1.3149 | 0.3438 | 0.2567 | 0.3438 | 0.2696 | 0.3438 | 0.3438 | 0.375 | 0.3067 | 0.3438 | 0.3148 |
1.2393 | 3.0 | 12 | 1.3121 | 0.2188 | 0.0785 | 0.2188 | 0.0897 | 0.2188 | 0.2188 | 0.25 | 0.0479 | 0.2188 | 0.0547 |
1.2317 | 4.0 | 16 | 1.3112 | 0.2812 | 0.1800 | 0.2812 | 0.2057 | 0.2812 | 0.2812 | 0.3214 | 0.2698 | 0.2812 | 0.3083 |
1.2107 | 5.0 | 20 | 1.2604 | 0.4375 | 0.3030 | 0.4375 | 0.3462 | 0.4375 | 0.4375 | 0.5 | 0.2552 | 0.4375 | 0.2917 |
1.1663 | 6.0 | 24 | 1.2112 | 0.4688 | 0.3896 | 0.4688 | 0.4310 | 0.4688 | 0.4688 | 0.5268 | 0.5041 | 0.4688 | 0.5404 |
1.1247 | 7.0 | 28 | 1.1746 | 0.5938 | 0.5143 | 0.5938 | 0.5603 | 0.5938 | 0.5938 | 0.6562 | 0.5220 | 0.5938 | 0.5609 |
1.0856 | 8.0 | 32 | 1.1434 | 0.5938 | 0.5143 | 0.5938 | 0.5603 | 0.5938 | 0.5938 | 0.6562 | 0.5220 | 0.5938 | 0.5609 |
1.0601 | 9.0 | 36 | 1.1417 | 0.6562 | 0.6029 | 0.6562 | 0.6389 | 0.6562 | 0.6562 | 0.7125 | 0.8440 | 0.6562 | 0.8217 |
1.0375 | 10.0 | 40 | 1.1227 | 0.6875 | 0.6582 | 0.6875 | 0.6831 | 0.6875 | 0.6875 | 0.7330 | 0.8457 | 0.6875 | 0.8237 |
1.0168 | 11.0 | 44 | 1.1065 | 0.7812 | 0.7692 | 0.7812 | 0.7845 | 0.7812 | 0.7812 | 0.8187 | 0.8717 | 0.7812 | 0.8534 |
1.0093 | 12.0 | 48 | 1.0887 | 0.7812 | 0.7692 | 0.7812 | 0.7845 | 0.7812 | 0.7812 | 0.8187 | 0.8717 | 0.7812 | 0.8534 |
Framework versions
- Transformers 4.25.1
- Pytorch 1.12.1
- Datasets 2.8.0
- Tokenizers 0.12.1
- Downloads last month
- 16
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.