Amr-khaled committed on
Commit 2628d40 · verified · 1 Parent(s): 532f63e

Update README.md

  1. README.md +15 -24
README.md CHANGED
@@ -39,28 +39,6 @@ The model transcribes text in Arabic without diacritical marks and supports peri
 
  This model is ready for commercial and non-commercial use.
 
- ## License
-
- License to use this model is covered by the [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/). By downloading the public and release version of the model, you accept the terms and conditions of the [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/) license.
-
- ## References
-
- [1] [Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition](https://arxiv.org/abs/2305.05084)
-
- [2] [Google Sentencepiece Tokenizer](https://github.com/google/sentencepiece)
-
- [3] [NVIDIA NeMo Toolkit](https://github.com/NVIDIA/NeMo)
-
- [4] [Open Universal Arabic ASR Leaderboard](https://huggingface.co/spaces/elmresearchcenter/open_universal_arabic_asr_leaderboard)
-
- <!-- ## NVIDIA NeMo: Training
-
- To train, fine-tune or play with the model you will need to install [NVIDIA NeMo](https://github.com/NVIDIA/NeMo).
- We recommend you install it after you've installed latest Pytorch version.
- ```
- pip install nemo_toolkit['all']
- ```
- -->
  ## Model Architecture
 
  FastConformer [1] is an optimized version of the Conformer model with 8x depthwise-separable convolutional downsampling.
@@ -83,7 +61,6 @@ This model provides transcribed speech as a string for a given audio sample.
  - **Other Properties Related to Output:** May Need Inverse Text Normalization; Does Not Handle Special Characters; Outputs text in Arabic without diacritical marks
 
  ## Limitations
-
  The model is non-streaming and outputs the speech as a string without diacritical marks.
  Not recommended for word-for-word transcription and punctuation as accuracy varies based on the characteristics of input audio (unrecognized word, accent, noise, speech type, and context of speech).
 
@@ -200,4 +177,18 @@ asr_model.transcribe(['sample_audio_1.wav', 'sample_audio_2.wav', 'sample_audio_
  - Model outputs text in Arabic without diacritical marks
  - Output text requires Inverse Text Normalization
  - The model is noise-sensitive
- - The model is Egyptian Dialect further finetuned
+ - The model is Egyptian Dialect further finetuned
+
+ ## License
+
+ License to use this model is covered by the [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/). By downloading the public and release version of the model, you accept the terms and conditions of the [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/) license.
+
+ ## References
+
+ [1] [Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition](https://arxiv.org/abs/2305.05084)
+
+ [2] [Google Sentencepiece Tokenizer](https://github.com/google/sentencepiece)
+
+ [3] [NVIDIA NeMo Toolkit](https://github.com/NVIDIA/NeMo)
+
+ [4] [Open Universal Arabic ASR Leaderboard](https://huggingface.co/spaces/elmresearchcenter/open_universal_arabic_asr_leaderboard)