alkiskoudounas committed
Commit 3df34d8 · verified · Parent: b0dfe0d

Updated README

Files changed (1):
  1. README.md +16 -7
README.md CHANGED
@@ -30,13 +30,23 @@ This model continues pre-training from a [model](https://huggingface.co/ALM/wav2

We evaluate voc2vec-as-pt on six datasets: ASVP-ESD, ASVP-ESD (babies), CNVVE, NonVerbal Vocalization Dataset, Donate a Cry, VIVAE.

+ The following table reports the average performance in terms of Unweighted Average Recall (UAR) and F1 Macro across the six datasets described above.
+
+ | Model | Architecture | Pre-training DS | UAR | F1 Macro |
+ |--------|-------------|-------------|-----------|-----------|
+ | **voc2vec** | wav2vec 2.0 | Voc125 | .612±.212 | .580±.230 |
+ | **voc2vec-as-pt** | wav2vec 2.0 | AudioSet + Voc125 | .603±.183 | .574±.194 |
+ | **voc2vec-ls-pt** | wav2vec 2.0 | LibriSpeech + Voc125 | .661±.206 | .636±.223 |
+ | **voc2vec-hubert-ls-pt** | HuBERT | LibriSpeech + Voc125 | **.696±.189** | **.678±.200** |
+
## Available Models

| Model | Description | Link |
|--------|-------------|------|
| **voc2vec** | Pre-trained model on **125 hours of non-verbal audio**. | [🔗 Model](https://huggingface.co/alkiskoudounas/voc2vec) |
- | **voc2vec-as-pt** | Continues pre-training from a model that was **initially trained on the AudioSet dataset**. | [🔗 Model](https://huggingface.co/alkiskoudounas/voc2vec-as-pt) |
- | **voc2vec-ls-pt** | Continues pre-training from a model that was **initially trained on the LibriSpeech dataset**. | [🔗 Model](https://huggingface.co/alkiskoudounas/voc2vec-ls-pt) |
+ | **voc2vec-as-pt** | Continues pre-training from a wav2vec2-like model that was **initially trained on the AudioSet dataset**. | [🔗 Model](https://huggingface.co/alkiskoudounas/voc2vec-as-pt) |
+ | **voc2vec-ls-pt** | Continues pre-training from a wav2vec2-like model that was **initially trained on the LibriSpeech dataset**. | [🔗 Model](https://huggingface.co/alkiskoudounas/voc2vec-ls-pt) |
+ | **voc2vec-hubert-ls-pt** | Continues pre-training from a HuBERT-like model that was **initially trained on the LibriSpeech dataset**. | [🔗 Model](https://huggingface.co/alkiskoudounas/voc2vec-hubert-ls-pt) |

## Usage examples

@@ -65,13 +75,12 @@ logits = model(**inputs).logits
```bibtex
@INPROCEEDINGS{koudounas2025icassp,
author={Koudounas, Alkis and La Quatra, Moreno and Siniscalchi, Sabato Marco and Baralis, Elena},
- booktitle={ICASSP 2025 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
+ booktitle={ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
title={voc2vec: A Foundation Model for Non-Verbal Vocalization},
year={2025},
volume={},
number={},
- pages={},
- keywords={},
- doi={}}
-
+ pages={1-5},
+ keywords={Pediatrics;Accuracy;Foundation models;Benchmark testing;Signal processing;Data models;Acoustics;Speech processing;Nonverbal vocalization;Representation Learning;Self-Supervised Models;Pre-trained Models},
+ doi={10.1109/ICASSP49660.2025.10890672}}
```
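The table added in this commit reports UAR (Unweighted Average Recall) and F1 Macro. UAR is per-class recall averaged uniformly over classes, so imbalanced datasets cannot inflate the score; F1 Macro averages per-class F1 the same way. A minimal sketch of both metrics using scikit-learn; the label arrays below are hypothetical placeholders, not data from the paper:

```python
from sklearn.metrics import f1_score, recall_score

# Hypothetical labels standing in for one benchmark dataset's ground truth
# and a fine-tuned classifier's predictions (not data from the paper).
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 2, 2]

# UAR = recall computed per class, then averaged uniformly (macro average).
uar = recall_score(y_true, y_pred, average="macro")

# F1 Macro averages the per-class F1 scores the same way.
f1_macro = f1_score(y_true, y_pred, average="macro")

print(f"UAR: {uar:.3f} | F1 Macro: {f1_macro:.3f}")
```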
 
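Only the context line `logits = model(**inputs).logits` from the README's usage example is visible in this diff. A minimal sketch of what loading one of the listed checkpoints presumably looks like, assuming the standard transformers audio-classification API; the exact snippet in the README may differ, and the classification head of a pre-trained-only checkpoint is randomly initialized until fine-tuned:

```python
import torch
from transformers import AutoFeatureExtractor, AutoModelForAudioClassification

# Any of the checkpoints in the Available Models table should load the same way.
model_id = "alkiskoudounas/voc2vec-as-pt"

feature_extractor = AutoFeatureExtractor.from_pretrained(model_id)
model = AutoModelForAudioClassification.from_pretrained(model_id)

# One second of silence at 16 kHz as a stand-in for a real vocalization clip.
waveform = torch.zeros(16000)

inputs = feature_extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # the line visible in the diff's hunk header
```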