Update README.md
README.md
CHANGED
@@ -2,9 +2,28 @@
 tags:
 - model_hub_mixin
 - pytorch_model_hub_mixin
+license: apache-2.0
+language:
+- en
+metrics:
+- accuracy
+base_model:
+- openai/whisper-large-v3
+pipeline_tag: audio-classification
 ---
+# Whisper Large v3 for Speech Flow (Fluency) Classification
 
-
-
-
+# Model Description
+This model implements the speech fluency classification described in Vox-Profile: A Speech Foundation Model Benchmark for Characterizing Diverse Speaker and Speech Traits (https://arxiv.org/pdf/2505.14648).
+
+The model first classifies the speech as one of ["fluent", "disfluent"], using a 3-second window and a 1-second step.
+If disfluent speech is detected, the model then predicts the disfluency types from: [
+"Block",
+"Prolongation",
+"Sound Repetition",
+"Word Repetition",
+"Interjection"
+]
+
+- Library: https://github.com/tiantiaf0627/vox-profile-release
 - Docs: [More Information Needed]
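The model description in this change mentions a 3-second window with a 1-second step, followed by a second prediction of disfluency types for windows flagged as disfluent. The sketch below illustrates that two-stage flow in plain Python. It is not the Vox-Profile implementation: the 16 kHz sample rate, the handling of clips shorter than one window, the dropping of a trailing partial window, and the `is_disfluent` and `disfluency_types` callables are all assumptions introduced here for illustration.

```python
from typing import Callable, List, Sequence, Tuple

SAMPLE_RATE = 16_000  # assumption: 16 kHz mono audio, the rate Whisper models expect
WINDOW_S = 3.0        # window size stated in the model description
STEP_S = 1.0          # step size stated in the model description


def sliding_windows(
    num_samples: int,
    sample_rate: int = SAMPLE_RATE,
    window_s: float = WINDOW_S,
    step_s: float = STEP_S,
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices for each analysis window."""
    window = int(window_s * sample_rate)
    step = int(step_s * sample_rate)
    if num_samples <= window:
        # Assumption: a clip shorter than one window is scored as a single segment.
        return [(0, num_samples)]
    # Assumption: a trailing partial window is dropped rather than padded.
    return [(s, s + window) for s in range(0, num_samples - window + 1, step)]


def classify_fluency(
    waveform: Sequence[float],
    is_disfluent: Callable[[Sequence[float]], bool],
    disfluency_types: Callable[[Sequence[float]], List[str]],
    sample_rate: int = SAMPLE_RATE,
) -> List[Tuple[float, float, str, List[str]]]:
    """Two-stage sketch: fluent/disfluent per window, then types if disfluent.

    `is_disfluent` and `disfluency_types` are hypothetical stand-ins for the
    real model heads; they are not part of any published API.
    """
    results = []
    for start, end in sliding_windows(len(waveform), sample_rate):
        segment = waveform[start:end]
        if is_disfluent(segment):
            label, types = "disfluent", disfluency_types(segment)
        else:
            label, types = "fluent", []
        results.append((start / sample_rate, end / sample_rate, label, types))
    return results


if __name__ == "__main__":
    # Five seconds of silence with stub classifiers that always say "fluent".
    silence = [0.0] * (5 * SAMPLE_RATE)
    for row in classify_fluency(silence, lambda seg: False, lambda seg: []):
        print(row)
```

Each result row holds the window start and end in seconds, the first-stage label, and the list of predicted types (empty for fluent windows). In practice the two callables would wrap the released model from the library linked in the README.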