Dionyssos committed on
Commit 6ea3fe5 · 1 Parent(s): b93e7b7

ckpt download speed

Files changed (1)
  1. README.md +4 -5
README.md CHANGED
@@ -12,7 +12,7 @@ tags:
 
 # Wav2Small2.0 - Arousal / Dominance / Valence
 
-Please note that this model is for research purpose only. A commercial [license](https://www.audeering.com/products/devaice/) can be acquired with audEERING. The model expects a raw audio signal 16KHz as input, and outputs: arousal, dominance valence in range [0, 1], as well as Anger/Happiness/Neutral/Sad probability. The model is created following the [Wav2Small paper](https://arxiv.org/abs/2408.13920) and has a total of 17K params.
+Please note that this model is for research purpose only. A commercial [license](https://www.audeering.com/products/devaice/) can be acquired with audEERING. The model expects a raw audio signal 16KHz as input, and outputs: arousal, dominance valence in range [0, 1]. The model is created following the [Wav2Small paper](https://arxiv.org/abs/2408.13920) and has a total of 17K params.
 
 
 # How To
@@ -20,11 +20,10 @@ Please note that this model is for research purpose only. A commercial [license]
 ```python
 import torch
 import numpy as np
-import torch.nn.functional as F
 import librosa
-from transformers.models.wav2vec2.modeling_wav2vec2 import Wav2Vec2PreTrainedModel
+from transformers import Wav2Vec2PreTrainedModel, PretrainedConfig
 from torch import nn
-from transformers import PretrainedConfig
+
 
 
 
@@ -97,7 +96,7 @@ class Spectrogram(nn.Module):
 
         real = self.conv_real(x)
         imag = self.conv_imag(x)
-        return real ** 2 + imag ** 2 # bs, mel, time-frames
+        return real ** 2 + imag ** 2 # bs, freq, time-frames
 
 
 class LogmelFilterBank(nn.Module):
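
The comment fix in the last hunk (`mel` to `freq`) reflects that `real ** 2 + imag ** 2` is a power spectrogram over linear frequency bins. The mel projection presumably only happens in the following `LogmelFilterBank` stage. The sketch below illustrates the idea in plain Python: it correlates each frame with a cosine kernel and a sine kernel per frequency bin and sums the squares. The function name `power_spectrogram`, the parameters `n_fft` and `hop`, and the absence of a window and a batch dimension are assumptions made for illustration. This is not the `Spectrogram` module from the README, whose `conv_real` / `conv_imag` kernels are not shown in this diff.

```python
import math


def power_spectrogram(signal, n_fft=8, hop=4):
    """Return power[freq][frame] computed with strided cos/sin correlations.

    Hypothetical helper: mimics a convolution-based STFT where one fixed
    cosine kernel and one fixed sine kernel exist per frequency bin.
    """
    n_freq = n_fft // 2 + 1  # non-negative frequency bins, DC to Nyquist
    cos_kernels = [
        [math.cos(2 * math.pi * k * n / n_fft) for n in range(n_fft)]
        for k in range(n_freq)
    ]
    sin_kernels = [
        [-math.sin(2 * math.pi * k * n / n_fft) for n in range(n_fft)]
        for k in range(n_freq)
    ]
    # Frame start positions, equivalent to a 1-D convolution with stride=hop.
    starts = range(0, len(signal) - n_fft + 1, hop)

    power = []
    for k in range(n_freq):
        row = []
        for start in starts:
            frame = signal[start:start + n_fft]
            real = sum(c * x for c, x in zip(cos_kernels[k], frame))
            imag = sum(s * x for s, x in zip(sin_kernels[k], frame))
            row.append(real ** 2 + imag ** 2)  # squared magnitude of the DFT bin
        power.append(row)
    return power  # freq, time-frames


# A cosine that sits exactly on bin 2 of an 8-point frame.
tone = [math.cos(2 * math.pi * 2 * n / 8) for n in range(16)]
spec = power_spectrogram(tone, n_fft=8, hop=4)
print(len(spec), len(spec[0]))  # number of frequency bins, number of frames
```

For this input the energy concentrates in row 2 (about `(n_fft / 2) ** 2 = 16` per frame) while the other rows stay near zero, which shows that the first axis indexes linear frequency rather than mel bands.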