tiantiaf/whisper-large-v3-speech-flow

Audio Classification

model_hub_mixin

pytorch_model_hub_mixin

tiantiaf committed on May 22

Commit bac4109 · verified · 1 Parent(s): 1b1d214

Update README.md

Files changed (1)

README.md +22 -3

@@ -2,9 +2,28 @@
 tags:
 - model_hub_mixin
 - pytorch_model_hub_mixin
+license: apache-2.0
+language:
+- en
+metrics:
+- accuracy
+base_model:
+- openai/whisper-large-v3
+pipeline_tag: audio-classification
 ---
-This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
-- Code: https://github.com/tiantiaf0627/vox-profile-release
-- Paper: [More Information Needed]
+# Whisper Large v3 for Speech Flow (Fluency) Classification
+# Model Description
+This model implements the speech fluency classification described in [Vox-Profile: A Speech Foundation Model Benchmark for Characterizing Diverse Speaker and Speech Traits](https://arxiv.org/pdf/2505.14648).
+The model first classifies speech as one of ["fluent", "disfluent"], using a 3-second window and a 1-second step.
+If disfluent speech is detected, the model then predicts the disfluency types from: [
+  "Block",
+  "Prolongation",
+  "Sound Repetition",
+  "Word Repetition",
+  "Interjection"
+]
+- Library: https://github.com/tiantiaf0627/vox-profile-release
 - Docs: [More Information Needed]
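
The windowing and two-stage decision described in the added README text can be sketched as follows. This is a minimal illustration, not the Vox-Profile implementation: `window_bounds`, `classify_flow`, and the `is_disfluent` and `disfluency_types` callables are hypothetical names, the 16 kHz sample rate is an assumption based on the Whisper base model, and the handling of short clips and trailing partial windows is a guess, since the README does not specify it.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: 16 kHz mono audio, the rate Whisper models expect.
SAMPLE_RATE = 16_000


def window_bounds(
    num_samples: int,
    sample_rate: int = SAMPLE_RATE,
    window_s: float = 3.0,
    step_s: float = 1.0,
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices for 3-second windows with a 1-second step."""
    window = int(window_s * sample_rate)
    step = int(step_s * sample_rate)
    if num_samples <= window:
        # Assumption: a clip no longer than one window is scored as a single segment.
        return [(0, num_samples)]
    # Assumption: a trailing partial window is dropped.
    return [(s, s + window) for s in range(0, num_samples - window + 1, step)]


def classify_flow(
    audio: Sequence[float],
    is_disfluent: Callable[[Sequence[float]], bool],
    disfluency_types: Callable[[Sequence[float]], List[str]],
    sample_rate: int = SAMPLE_RATE,
) -> List[Tuple[float, str, List[str]]]:
    """Two-stage decision per window: fluent/disfluent first, then disfluency types.

    Both callables are hypothetical stand-ins for the model's two prediction heads.
    """
    results = []
    for start, end in window_bounds(len(audio), sample_rate):
        segment = audio[start:end]
        if is_disfluent(segment):
            # Second stage runs only when the first stage flags the window.
            results.append((start / sample_rate, "disfluent", disfluency_types(segment)))
        else:
            results.append((start / sample_rate, "fluent", []))
    return results
```

Under these assumptions, a 5-second clip yields three windows starting at 0 s, 1 s, and 2 s, and the type predictor is called only for windows flagged as disfluent.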