Commit 972d2ba (verified) by asadullah797
1 Parent(s): 7e41af2

Push model using huggingface_hub.

Files changed (2)
  1. README.md +4 -40
  2. model.safetensors +1 -1
README.md CHANGED
@@ -9,43 +9,7 @@ tags:
  - speaker-identification
  ---
 
- Multitask Speech Model with Wav2Vec2
-
- This repository contains a multitask learning pipeline built on top of Wav2Vec2
- , designed to jointly perform:
-
- Automatic Speech Recognition (ASR) (character-level CTC loss)
-
- Speaker Identification
-
- Emotion Recognition
-
- The system is trained on a combination of training dataset with parallel data from speech transcriptions, speaker identification and emotion recognition labels.
-
- 📌 Features
-
- Multitask model (Wav2Vec2MultiTasks) with shared Wav2Vec2 encoder and separate heads for:
-
- Speech Recognition (CTC)
-
- Speaker classification
-
- Emotion classification
-
- Custom data preprocessing:
-
- Cleans transcripts (removes punctuation & special characters)
-
- Converts numbers into words
-
- Builds a vocabulary and tokenizer
-
- Filters short/invalid audio
-
- Training, validation, and test splits with collators for CTC.
-
- Evaluation metrics:
-
- Character Error Rate (CER) for character recognition
-
- Accuracy for speaker and emotion classification
+ This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
+ - Code: https://huggingface.co/asadullah797/ssl-semi-multitask
+ - Paper: [More Information Needed]
+ - Docs: https://github.com/asadullah797/ssl_semi-multitask/blob/main/README.md
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:ae533279a661450c6fb7b60a7e2052b262bd0260bf7ca144e05ed7c7bed109bb
+ oid sha256:c56ddf35a0ec7104eed113e1ffd022ba1e38a35e77403b71e06a29be01dd3791
  size 378804760
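
The `model.safetensors` entry is a Git LFS pointer, so the diff shows only the SHA-256 digest and the byte size of the weights file. The size is unchanged while the digest differs, which would be expected if the tensors kept their shapes and only their values changed. As a minimal sketch, a downloaded file could be checked against the new pointer as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path is left to the caller.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.strip().split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict) -> bool:
    """Return True if the file at `path` has the pointer's size and digest."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# Pointer contents copied from the "+" side of the diff above.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:c56ddf35a0ec7104eed113e1ffd022ba1e38a35e77403b71e06a29be01dd3791
size 378804760
"""

pointer = parse_lfs_pointer(NEW_POINTER)
```

Calling `matches_pointer("model.safetensors", pointer)` on a local copy would then report whether that copy corresponds to this commit.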