# Wav2Vec2 Speech Emotion Recognition for English

## Model Overview
This model is fine-tuned for recognizing emotions in English speech using the Wav2Vec2 architecture. It can detect the following emotions:

- Sadness
- Anger
- Disgust
- Fear
- Happiness
- Neutral

The model was trained on the Speech Emotion Recognition dataset from Kaggle, which consists of English speech samples labeled with emotional states, making it well suited for emotion recognition tasks.

## Model Details
- Architecture: Wav2Vec2
- Language: English
- Dataset: Speech Emotion Recognition Dataset (Kaggle)
- Emotions Detected: Sadness, Anger, Disgust, Fear, Happiness, Neutral

## How to Use
### Installation
To use this model, install the `transformers` and `torchaudio` packages:

```bash
pip install transformers torchaudio
```

### Example Usage
Here is an example of how to use the model to classify emotions in an English audio file:

```python
from transformers import pipeline

# Load the fine-tuned model and feature extractor
pipe = pipeline("audio-classification", model="Khoa/w2v-speech-emotion-recognition")

# Path to your audio file
audio_file = "path_to_your_audio_file.wav"

# Perform emotion classification
predictions = pipe(audio_file)

# Map predictions to real emotion labels
label_map = {
    "LABEL_0": "sadness",
    "LABEL_1": "angry",
    "LABEL_2": "disgust",
    "LABEL_3": "fear",
    "LABEL_4": "happy",
    "LABEL_5": "neutral"
}

# Convert predictions to readable labels
mapped_predictions = [
    {"score": pred["score"], "label": label_map[pred["label"]]}
    for pred in predictions
]

# Display results
print(mapped_predictions)
```

### Example Output
The model outputs a list of predictions with scores for each emotion. For example:

```json
[
  {"score": 0.95, "label": "angry"},
  {"score": 0.02, "label": "happy"},
  {"score": 0.01, "label": "disgust"},
  {"score": 0.01, "label": "neutral"},
  {"score": 0.01, "label": "fear"}
]
```

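If only a single predicted emotion is needed, the highest-scoring entry can be selected from such a list. A minimal sketch using hard-coded example scores (illustrative values, not real model output):

```python
# Hypothetical raw predictions in the pipeline's output format.
predictions = [
    {"score": 0.95, "label": "LABEL_1"},
    {"score": 0.03, "label": "LABEL_4"},
    {"score": 0.02, "label": "LABEL_5"},
]

# Same label mapping as in the usage example above.
label_map = {
    "LABEL_0": "sadness",
    "LABEL_1": "angry",
    "LABEL_2": "disgust",
    "LABEL_3": "fear",
    "LABEL_4": "happy",
    "LABEL_5": "neutral",
}

# Take the entry with the highest score and map it to its emotion name.
top = max(predictions, key=lambda p: p["score"])
print(label_map[top["label"]])  # angry
```
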
## Training Details
The model was fine-tuned on the Speech Emotion Recognition Dataset using the Wav2Vec2 architecture. Training ran for multiple epochs with a learning rate of 1e-5.

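The exact training configuration is not specified beyond the learning rate. As a rough, hypothetical sketch of how such hyperparameters could be expressed with the `transformers` `TrainingArguments` API — only the 1e-5 learning rate comes from this card; every other value is an illustrative assumption:

```python
from transformers import TrainingArguments

# Hypothetical fine-tuning configuration (a sketch, not the actual recipe).
training_args = TrainingArguments(
    output_dir="w2v-speech-emotion-recognition",
    learning_rate=1e-5,              # stated in this card
    num_train_epochs=5,              # "multiple epochs"; exact count unspecified
    per_device_train_batch_size=8,   # assumed; not stated in this card
)
```
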
## Limitations and Biases
This model is trained specifically on English speech data and may not perform well on other languages or dialects. As with any machine learning model, biases present in the training data may affect its predictions.