wavlm-large_finetuned_RAVDESS

This model is a fine-tuned version of microsoft/wavlm-large on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). After 20 epochs of training, it achieves the following results on the evaluation set:

  • Loss: 0.3534
  • Accuracy: 0.9028

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
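
The relationship between these values can be checked with a few lines of arithmetic. The sketch below is not the original training script. It assumes a single training device (so that 32 × 4 gives the listed total batch size of 128), takes the 9 optimizer steps per epoch from the Step column of the training results, and assumes the usual linear-warmup-then-linear-decay shape for the `linear` scheduler. The names `linear_lr` and `STEPS_PER_EPOCH` are illustrative only.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 3e-4
TRAIN_BATCH_SIZE = 32
GRADIENT_ACCUMULATION_STEPS = 4
WARMUP_RATIO = 0.1
NUM_EPOCHS = 20

# Assumption: 9 optimizer steps per epoch, read off the Step column
# of the training results (step 9 at epoch 1, step 180 at epoch 20).
STEPS_PER_EPOCH = 9

# Assumption: one device, so the effective batch is per-device batch
# size times the number of gradient accumulation steps.
effective_batch_size = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size)
    print("total optimizer steps:", total_steps)
    print("warmup steps:", warmup_steps)
    for step in (0, 9, 18, 99, 180):
        print(f"step {step:3d}: lr = {linear_lr(step):.6f}")
```

Under these assumptions the learning rate peaks at 0.0003 at the end of epoch 2 (step 18) and decays linearly to zero by step 180.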

Training results

Training Loss   Epoch   Step   Validation Loss   Accuracy
No log          1.0     9      2.0485            0.2326
2.0712          2.0     18     1.8028            0.2917
1.9355          3.0     27     1.7300            0.3229
1.7116          4.0     36     1.3749            0.4722
1.4907          5.0     45     1.0586            0.6493
1.1558          6.0     54     0.8834            0.6771
0.8621          7.0     63     0.9206            0.6944
0.6437          8.0     72     0.5895            0.8194
0.4634          9.0     81     0.7389            0.7743
0.3974          10.0    90     0.4569            0.8542
0.3974          11.0    99     0.5140            0.8438
0.3105          12.0    108    0.4273            0.8611
0.2094          13.0    117    0.3608            0.8993
0.1401          14.0    126    0.5715            0.8194
0.1249          15.0    135    0.3715            0.8854
0.0953          16.0    144    0.4112            0.8785
0.0955          17.0    153    0.3692            0.9062
0.0807          18.0    162    0.4395            0.8646
0.1077          19.0    171    0.3413            0.9201
0.0578          20.0    180    0.3534            0.9028
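
Validation accuracy fluctuates from epoch to epoch, so it can help to compare the final epoch with the best one. The following sketch copies the epoch, validation loss, and accuracy columns from the results above and picks out the best rows. The variable names are illustrative. Whether the published weights come from the final epoch or an earlier checkpoint is not stated in this card, although the headline metrics match epoch 20.

```python
# (epoch, validation_loss, accuracy) rows copied from the training results.
RESULTS = [
    (1, 2.0485, 0.2326),
    (2, 1.8028, 0.2917),
    (3, 1.7300, 0.3229),
    (4, 1.3749, 0.4722),
    (5, 1.0586, 0.6493),
    (6, 0.8834, 0.6771),
    (7, 0.9206, 0.6944),
    (8, 0.5895, 0.8194),
    (9, 0.7389, 0.7743),
    (10, 0.4569, 0.8542),
    (11, 0.5140, 0.8438),
    (12, 0.4273, 0.8611),
    (13, 0.3608, 0.8993),
    (14, 0.5715, 0.8194),
    (15, 0.3715, 0.8854),
    (16, 0.4112, 0.8785),
    (17, 0.3692, 0.9062),
    (18, 0.4395, 0.8646),
    (19, 0.3413, 0.9201),
    (20, 0.3534, 0.9028),
]

# Row with the highest validation accuracy.
best_by_accuracy = max(RESULTS, key=lambda row: row[2])
# Row with the lowest validation loss.
best_by_loss = min(RESULTS, key=lambda row: row[1])
# Last row, which matches the headline metrics of this card.
final = RESULTS[-1]

if __name__ == "__main__":
    print("best accuracy (epoch, loss, acc):", best_by_accuracy)
    print("lowest loss   (epoch, loss, acc):", best_by_loss)
    print("final epoch   (epoch, loss, acc):", final)
```

In this log, epoch 19 has both the highest accuracy (0.9201) and the lowest validation loss (0.3413), slightly ahead of the final epoch.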

Framework versions

  • Transformers 4.45.2
  • Pytorch 2.5.0+cu118
  • Datasets 3.0.1
  • Tokenizers 0.20.1

Model size

  • 316M params (Safetensors, tensor type F32)