nguyenvulebinh committed on
Commit 0337cce · verified · 1 Parent(s): 7cc5b03

Update README.md

Files changed (1)
  1. README.md +71 -1
README.md CHANGED
@@ -51,7 +51,65 @@ python infer.py
  - 14k audio hours
  - English only
 
- Dataset is open available in [HF Dataset](https://huggingface.co/datasets/nguyenvulebinh/spk-attribute)
+ Dataset is openly available in [HF Dataset](https://huggingface.co/datasets/nguyenvulebinh/spk-attribute)
+
+ *Example*
+
+ Audio
+
+ <audio controls>
+ <source src="https://huggingface.co/nguyenvulebinh/MSA-ASR/resolve/main/sample_augment.wav" type="audio/wav">
+ Your browser does not support the audio element.
+ </audio>
+
+
+ Label:
+
+ ```code
+ spk_1 A 0.00 1.58 »spk_1
+ spk_1 A 0.00 1.58 Pacifica
+ spk_1 A 1.58 0.68 continues
+ spk_1 A 2.27 0.52 today
+ spk_1 A 2.79 0.24 to
+ spk_1 A 3.03 0.20 be
+ spk_1 A 3.23 0.14 a
+ spk_1 A 3.37 0.54 listener
+ spk_1 A 3.91 0.80 supported
+ spk_1 A 4.71 0.70 network
+ spk_1 A 5.42 0.38 of
+ spk_2 A 5.80 0.12 »spk_2
+ spk_2 A 5.80 0.12 At
+ spk_2 A 5.92 0.42 home,
+ spk_2 A 6.34 0.18 an
+ spk_2 A 6.52 0.38 Aed
+ spk_2 A 6.90 0.26 is
+ spk_2 A 7.16 0.18 an
+ spk_2 A 7.34 0.56 automated
+ spk_2 A 7.90 0.60 external
+ spk_2 A 8.50 0.90 defibrillator.
+ spk_2 A 9.40 0.40 It's
+ spk_2 A 9.81 0.08 the
+ spk_2 A 9.89 0.36 device
+ spk_2 A 10.25 0.08 you
+ spk_2 A 10.33 0.16 use
+ spk_2 A 10.49 0.12 when
+ spk_2 A 10.61 0.10 your
+ spk_2 A 10.73 0.16 heart
+ spk_2 A 10.89 0.18 goes
+ spk_2 A 11.07 0.12 into
+ spk_2 A 11.19 0.38 cardiac
+ spk_2 A 11.57 0.38 arrest
+ spk_2 A 11.95 0.18 to
+ spk_2 A 12.13 0.36 shock
+ spk_2 A 12.49 0.14 it
+ spk_2 A 12.63 0.28 back
+ spk_2 A 12.91 0.22 into
+ spk_2 A 13.13 0.06 a
+ spk_2 A 13.19 0.32 normal
+ spk_2 A 13.51 0.88 rhythm.
+ spk_1 A 14.40 1.38 »spk_1
+ spk_1 A 14.40 1.38 stations.
+ ```
 
  ### Citation
 
@@ -67,6 +125,18 @@ Dataset is open available in [HF Dataset](https://huggingface.co/datasets/nguyen
  keywords={Training;Adaptation models;Limiting;Predictive models;Data models;Robustness;Multilingual;Data mining;Speech processing;Standards;speaker-attributed;asr;multilingual},
  doi={10.1109/ICASSP49660.2025.10889116}}
 
+ @INPROCEEDINGS{10446589,
+ author={Nguyen, Thai-Binh and Waibel, Alexander},
+ booktitle={ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
+ title={Synthetic Conversations Improve Multi-Talker ASR},
+ year={2024},
+ volume={},
+ number={},
+ pages={10461-10465},
+ keywords={Systematics;Error analysis;Knowledge based systems;Oral communication;Signal processing;Data models;Acoustics;multi-talker;asr;synthetic conversation},
+ doi={10.1109/ICASSP48485.2024.10446589}}
+
+
  ```
 
  ### License
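Note on the added label example: the README does not document the column layout, so the following is only a minimal sketch under stated assumptions. It assumes each label line holds five whitespace-separated fields (speaker ID, channel, start time in seconds, duration in seconds, token) and that tokens beginning with `»` mark a speaker change rather than a spoken word. The names `Token`, `parse_label`, and `group_turns` are hypothetical and not part of the repository. The sample lines are copied from the label block in the diff.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Token:
    # Field meanings are an assumption inferred from the sample label,
    # not a documented specification.
    speaker: str
    channel: str
    start: float      # assumed: start time in seconds
    duration: float   # assumed: duration in seconds
    text: str


def parse_label(label: str) -> List[Token]:
    """Parse label text into tokens, skipping blank or malformed lines."""
    tokens = []
    for line in label.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue
        speaker, channel, start, duration, text = parts
        tokens.append(Token(speaker, channel, float(start), float(duration), text))
    return tokens


def group_turns(tokens: List[Token]) -> List[Tuple[str, str]]:
    """Merge consecutive words from one speaker into (speaker, text) turns.

    Tokens starting with "»" are treated as speaker-change markers
    (an assumption) and are dropped from the text.
    """
    turns: List[Tuple[str, str]] = []
    for tok in tokens:
        if tok.text.startswith("»"):
            continue
        if turns and turns[-1][0] == tok.speaker:
            turns[-1] = (tok.speaker, turns[-1][1] + " " + tok.text)
        else:
            turns.append((tok.speaker, tok.text))
    return turns


# A few lines taken from the label example in the diff.
SAMPLE = """\
spk_1 A 4.71 0.70 network
spk_1 A 5.42 0.38 of
spk_2 A 5.80 0.12 »spk_2
spk_2 A 5.80 0.12 At
spk_2 A 5.92 0.42 home,
spk_1 A 14.40 1.38 »spk_1
spk_1 A 14.40 1.38 stations.
"""

if __name__ == "__main__":
    for speaker, text in group_turns(parse_label(SAMPLE)):
        print(f"{speaker}: {text}")
```

Grouping by the speaker column instead of relying on the `»` markers keeps the sketch working even if the markers turn out to mean something else.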