alkiskoudounas committed
Commit 21c5653 · verified · 1 Parent(s): 5694805

Updated README

Files changed (1):
  1. README.md (+18 −8)
README.md CHANGED

````diff
@@ -28,15 +28,26 @@ This model continues pre-training from a [model](https://huggingface.co/facebook
 
 ## Task and datasets description
 
-We evaluate voc2vec-as-pt on six datasets: ASVP-ESD, ASPV-ESD (babies), CNVVE, NonVerbal Vocalization Dataset, Donate a Cry, VIVAE.
+We evaluate voc2vec-ls-pt on six datasets: ASVP-ESD, ASPV-ESD (babies), CNVVE, NonVerbal Vocalization Dataset, Donate a Cry, VIVAE.
+
+The following table reports the average performance in terms of Unweighted Average Recall (UAR) and F1 Macro across the six datasets described above.
+
+| Model | Architecture | Pre-training DS | UAR | F1 Macro |
+|--------|-------------|-------------|-----------|-----------|
+| **voc2vec** | wav2vec 2.0 | Voc125 | .612±.212 | .580±.230 |
+| **voc2vec-as-pt** | wav2vec 2.0 | AudioSet + Voc125 | .603±.183 | .574±.194 |
+| **voc2vec-ls-pt** | wav2vec 2.0 | LibriSpeech + Voc125 | .661±.206 | .636±.223 |
+| **voc2vec-hubert-ls-pt** | HuBERT | LibriSpeech + Voc125 | **.696±.189** | **.678±.200** |
+
 
 ## Available Models
 
 | Model | Description | Link |
 |--------|-------------|------|
 | **voc2vec** | Pre-trained model on **125 hours of non-verbal audio**. | [🔗 Model](https://huggingface.co/alkiskoudounas/voc2vec) |
-| **voc2vec-as-pt** | Continues pre-training from a model that was **initially trained on the AudioSet dataset**. | [🔗 Model](https://huggingface.co/alkiskoudounas/voc2vec-as-pt) |
-| **voc2vec-ls-pt** | Continues pre-training from a model that was **initially trained on the LibriSpeech dataset**. | [🔗 Model](https://huggingface.co/alkiskoudounas/voc2vec-ls-pt) |
+| **voc2vec-as-pt** | Continues pre-training from a wav2vec2-like model that was **initially trained on the AudioSet dataset**. | [🔗 Model](https://huggingface.co/alkiskoudounas/voc2vec-as-pt) |
+| **voc2vec-ls-pt** | Continues pre-training from a wav2vec2-like model that was **initially trained on the LibriSpeech dataset**. | [🔗 Model](https://huggingface.co/alkiskoudounas/voc2vec-ls-pt) |
+| **voc2vec-hubert-ls-pt** | Continues pre-training from a hubert-like model that was **initially trained on the LibriSpeech dataset**. | [🔗 Model](https://huggingface.co/alkiskoudounas/voc2vec-hubert-ls-pt) |
 
 ## Usage examples
 
@@ -65,13 +76,12 @@ logits = model(**inputs).logits
 ```bibtex
 @INPROCEEDINGS{koudounas2025icassp,
 author={Koudounas, Alkis and La Quatra, Moreno and Siniscalchi, Sabato Marco and Baralis, Elena},
-booktitle={ICASSP 2025 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
+booktitle={ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
 title={voc2vec: A Foundation Model for Non-Verbal Vocalization},
 year={2025},
 volume={},
 number={},
-pages={},
-keywords={},
-doi={}}
-
+pages={1-5},
+keywords={Pediatrics;Accuracy;Foundation models;Benchmark testing;Signal processing;Data models;Acoustics;Speech processing;Nonverbal vocalization;Representation Learning;Self-Supervised Models;Pre-trained Models},
+doi={10.1109/ICASSP49660.2025.10890672}}
 ```
````
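The results table added in this commit reports Unweighted Average Recall (UAR) and F1 Macro. For reference, here is a minimal pure-Python sketch of both metrics; the labels below are made-up toy data for illustration, not from the paper:

```python
def uar(y_true, y_pred):
    """Unweighted Average Recall: unweighted mean of per-class recall."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        support = sum(1 for t in y_true if t == c)
        recalls.append(tp / support)
    return sum(recalls) / len(classes)

def f1_macro(y_true, y_pred):
    """Macro F1: unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Toy vocalization labels (illustrative only):
y_true = ["cry", "cry", "laugh", "laugh", "cough", "cough"]
y_pred = ["cry", "laugh", "laugh", "laugh", "cough", "cry"]
print(round(uar(y_true, y_pred), 3))       # → 0.667
print(round(f1_macro(y_true, y_pred), 3))  # → 0.656
```

Both metrics weight every class equally regardless of support, which is why they are preferred over plain accuracy on the imbalanced datasets listed above; the same values can be obtained with scikit-learn's `recall_score` and `f1_score` with `average="macro"`.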