---
library_name: nemo
license: apache-2.0
language:
- fa
pipeline_tag: automatic-speech-recognition
tags:
- Persian
- Neura
- PersianASR
datasets:
- common_voice_17_0
---
# Neura Speech Nemo
<p align="center">
<img src="neura_speech.png" width=512 height=256 />
</p>
<!-- Provide a quick summary of what the model is/does. -->
## Model Description
<!-- Provide a longer summary of what this model is. -->
- **Developed by:** Neura company
- **Funded by:** Neura
- **Model type:** fa_FastConformers_Transducer
- **Language(s) (NLP):** Persian
## Model Architecture
This model uses a FastConformer-TDT architecture. FastConformer [1] is an optimized version of the Conformer model with 8x depthwise-separable convolutional downsampling.
For more details on FastConformer, see Fast-Conformer Model and the paper [1].

[1] [Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition](https://arxiv.org/abs/2305.05084)
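To give a feel for what 8x downsampling means in practice, here is a rough sketch of the arithmetic. It assumes a 10 ms feature hop, which is a common ASR front-end setting but is not stated in this card. The function name `approx_encoder_frames` is hypothetical, and the estimate ignores padding and windowing effects.

```python
def approx_encoder_frames(duration_ms: int, hop_ms: int = 10, subsampling: int = 8) -> int:
    """Estimate how many encoder frames an utterance produces.

    hop_ms=10 is an assumed feature hop; subsampling=8 is the 8x
    downsampling factor described for FastConformer.
    """
    feature_frames = duration_ms // hop_ms  # frames entering the encoder
    # Ceiling division: a partial group of frames still yields one output frame.
    return -(-feature_frames // subsampling)


if __name__ == "__main__":
    # Under these assumptions, each encoder frame covers 10 ms * 8 = 80 ms of audio.
    print(approx_encoder_frames(10_000))
```

Under these assumptions, a 10-second clip is reduced from about 1000 feature frames to about 125 encoder frames, which is where much of the speed-up over a 4x Conformer comes from.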
## Uses
Try the Google Colab demo to run NeuraSpeech ASR on a free-tier Google Colab instance: [Open in Colab](https://colab.research.google.com/drive/1kt34iFb_ez0y2SjU_km3vnzG4ccdVrXB#scrollTo=Z9DvUwmKtmR7)

Make sure the required package is installed:
```
!pip install nemo_toolkit['all']
```
```python
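from IPython.display import Audio, display

# Play the sample file in the notebook. The `rate` argument only applies to
# raw array data, so it is not needed when passing a file path.
display(Audio('persian_audio.mp3', autoplay=True))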
```
```python
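import nemo
import nemo.collections.asr as nemo_asr

print('nemo', nemo.__version__)

# Download the checkpoint from the Hugging Face Hub and build the model.
asr_model = nemo_asr.models.EncDecRNNTBPEModel.from_pretrained(model_name="Neurai/NeuraSpeech_900h")

# The file list is passed positionally because the name of this argument
# differs between NeMo releases (`paths2audio_files` in older ones).
asr_model.transcribe(['persian_audio.mp3'], batch_size=1)[0]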
```
Transcribed text:
```
او خواهان آزاد کردن بردگان بود
```
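The shape of the value returned by `transcribe` can vary with the installed NeMo version: for transducer models, some releases return a tuple whose first element holds the best hypotheses, and some return hypothesis objects with a `text` attribute instead of plain strings. The helper below is a minimal sketch for normalizing these cases to a list of strings. `extract_texts` is a hypothetical name, not part of NeMo, and the return shapes it handles are assumptions worth checking against your installed version.

```python
from typing import Any, List


def extract_texts(result: Any) -> List[str]:
    """Normalize a transcribe() result to a plain list of strings.

    Hypothetical helper; the handled shapes are assumptions about
    different NeMo releases, not a documented contract.
    """
    # Assumed older transducer behavior: (best_hypotheses, all_hypotheses).
    if isinstance(result, tuple):
        result = result[0]
    texts = []
    for item in result:
        # Plain strings pass through; hypothesis-like objects expose `.text`.
        texts.append(item if isinstance(item, str) else item.text)
    return texts


# Usage with the model loaded above (not executed here):
# texts = extract_texts(asr_model.transcribe(['persian_audio.mp3'], batch_size=1))
# print(texts[0])
```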
## More Information
https://neura.info
## Model Card Authors
Esmaeil Zahedi, Mohsen Yazdinejad
## Model Card Contact
[email protected]