I need to verify this pretrained model on another dataset with multiple wave files. How should I arrange the code?

#7
by messaoudi - opened
Files changed (1)
README.md +3 -15
README.md CHANGED
@@ -15,20 +15,8 @@ pipeline_tag: audio-classification
 
  # Model for Dimensional Speech Emotion Recognition based on Wav2vec 2.0
 
- Please note that this model is for research purpose only.
- A commercial license for a model
- that has been trained on much more data
- can be acquired with [audEERING](https://www.audeering.com/products/devaice/).
- The model expects a raw audio signal as input,
- and outputs predictions for arousal, dominance and valence in a range of approximately 0...1.
- In addition,
- it provides the pooled states of the last transformer layer.
- The model was created by fine-tuning
- [Wav2Vec2-Large-Robust](https://huggingface.co/facebook/wav2vec2-large-robust)
- on [MSP-Podcast](https://ecs.utdallas.edu/research/researchlabs/msp-lab/MSP-Podcast.html) (v1.7).
- The model was pruned from 24 to 12 transformer layers before fine-tuning.
- An [ONNX](https://onnx.ai/) export of the model is available from [doi:10.5281/zenodo.6221127](https://zenodo.org/record/6221127).
- Further details are given in the associated [paper](https://arxiv.org/abs/2203.07378) and [tutorial](https://github.com/audeering/w2v2-how-to).
+ The model expects a raw audio signal as input and outputs predictions for arousal, dominance and valence in a range of approximately 0...1. In addition, it also provides the pooled states of the last transformer layer. The model was created by fine-tuning [Wav2Vec2-Large-Robust](https://huggingface.co/facebook/wav2vec2-large-robust) on [MSP-Podcast](https://ecs.utdallas.edu/research/researchlabs/msp-lab/MSP-Podcast.html) (v1.7). The model was pruned from 24 to 12 transformer layers before fine-tuning. An [ONNX](https://onnx.ai/) export of the model is available from [doi:10.5281/zenodo.6221127](https://zenodo.org/record/6221127). Further details are given in the associated [paper](https://arxiv.org/abs/2203.07378) and [tutorial](https://github.com/audeering/w2v2-how-to).
 
  # Usage
 
@@ -96,7 +84,7 @@ class EmotionModel(Wav2Vec2PreTrainedModel):
  device = 'cpu'
  model_name = 'audeering/wav2vec2-large-robust-12-ft-emotion-msp-dim'
  processor = Wav2Vec2Processor.from_pretrained(model_name)
- model = EmotionModel.from_pretrained(model_name).to(device)
+ model = EmotionModel.from_pretrained(model_name)
 
  # dummy signal
  sampling_rate = 16000
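
On the opening question (verifying the model on another dataset of wave files): a minimal sketch along the lines below should work. It assumes the `EmotionModel` class from the README's usage section is already defined, and it uses `librosa` for loading/resampling plus a hypothetical `path/to/wavs` folder; none of these choices come from the PR itself.

```python
import os

import librosa
import torch
from transformers import Wav2Vec2Processor

device = 'cpu'
model_name = 'audeering/wav2vec2-large-robust-12-ft-emotion-msp-dim'
processor = Wav2Vec2Processor.from_pretrained(model_name)
# EmotionModel is the class defined in the README's usage section
model = EmotionModel.from_pretrained(model_name).to(device)

sampling_rate = 16000  # the model expects 16 kHz input
wav_dir = 'path/to/wavs'  # hypothetical folder holding the evaluation WAV files

for fname in sorted(os.listdir(wav_dir)):
    if not fname.lower().endswith('.wav'):
        continue
    # load the file and resample it to 16 kHz
    signal, _ = librosa.load(os.path.join(wav_dir, fname), sr=sampling_rate)
    inputs = processor(signal, sampling_rate=sampling_rate, return_tensors='pt')
    with torch.no_grad():
        # EmotionModel returns the pooled hidden states and the
        # arousal/dominance/valence predictions
        _, logits = model(inputs['input_values'].to(device))
    arousal, dominance, valence = logits[0].tolist()
    print(f'{fname}: arousal={arousal:.2f}, dominance={dominance:.2f}, valence={valence:.2f}')
```

Collecting the per-file predictions and comparing them against the dataset's own annotations (e.g. with CCC or MSE) then gives the verification asked about above. Note the sketch keeps the `.to(device)` call that this PR removes from the README snippet, so the `device` variable stays meaningful.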