---
library_name: Nvidia Nemo
license: apache-2.0
language:
- fa
pipeline_tag: automatic-speech-recognition
tags:
- Persian
- Neura
- PersianASR
datasets: 
- common_voice_17_0
---
# Neura Speech Nemo

<p align="center">
  <img src="neura_speech.png" width=512 height=256 />
</p>

<!-- Provide a quick summary of what the model is/does. -->

## Model Description

<!-- Provide a longer summary of what this model is. -->

- **Developed by:** Neura company
- **Funded by:** Neura
- **Model type:** fa_FastConformers_Transducer
- **Language(s) (NLP):** Persian

## Model Architecture

This model uses a FastConformer-TDT architecture. FastConformer [1] is an optimized version of the Conformer model with 8x depthwise-separable convolutional downsampling.
More details on FastConformer are available in the paper:
[1] [Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition](https://arxiv.org/abs/2305.05084).
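
As a rough illustration of what 8x downsampling means for sequence length, the sketch below estimates how many encoder frames a clip produces. The 10 ms feature hop is an assumption (a common default for mel-spectrogram front ends), not a value stated in this card, and `encoder_frames` is a hypothetical helper, not part of NeMo. Real models may differ by a frame or two because of padding.

```python
def encoder_frames(duration_s: float, hop_ms: int = 10, downsampling: int = 8) -> int:
    """Estimate the number of encoder output frames for a clip.

    hop_ms is an ASSUMED feature hop (10 ms); downsampling=8 comes from the
    8x convolutional downsampling described above.
    """
    duration_ms = round(duration_s * 1000)
    # Ceiling division: number of feature frames before downsampling.
    feature_frames = -(-duration_ms // hop_ms)
    # Ceiling division again: frames left after 8x downsampling.
    return -(-feature_frames // downsampling)


# Under these assumptions each encoder frame covers 8 * 10 ms = 80 ms of audio.
print(encoder_frames(10.0))
```

Under the assumed 10 ms hop, a 10-second clip shrinks from 1000 feature frames to 125 encoder frames, which is why the attention layers are cheaper than in a Conformer with 4x downsampling.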

## Uses
Check out the Google Colab demo to run NeuraSpeech ASR on a free-tier Google Colab instance: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1kt34iFb_ez0y2SjU_km3vnzG4ccdVrXB#scrollTo=Z9DvUwmKtmR7)



Make sure the NeMo toolkit is installed:
```
!pip install "nemo_toolkit[all]"
```
```python
from IPython.display import Audio, display
display(Audio('persian_audio.mp3', rate=32_000, autoplay=True))
```

```python
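import nemo
import nemo.collections.asr as nemo_asr

print('nemo', nemo.__version__)

# Download the pretrained checkpoint from the Hugging Face Hub and load it.
asr_model = nemo_asr.models.EncDecRNNTBPEModel.from_pretrained(model_name="Neurai/NeuraSpeech_900h")

# The file list is passed positionally because the keyword name differs across
# NeMo releases (`paths2audio_files` in older ones, `audio` in newer ones).
asr_model.transcribe(['persian_audio.mp3'], batch_size=1)[0]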
```
Transcribed text (in English: "He wanted to free the slaves"):
```
او خواهان آزاد کردن بردگان بود
```
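
To transcribe more than one file, the paths can be grouped into batches before being passed to `transcribe`. The following is a minimal sketch using only the standard library; `iter_audio_batches`, the `recordings` folder name, and the extension list are illustrative assumptions, not part of NeMo or of this model's release.

```python
from pathlib import Path
from typing import Iterator, List

# Assumed set of audio extensions; adjust to the formats you actually use.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac"}


def iter_audio_batches(folder: str, batch_size: int = 4) -> Iterator[List[str]]:
    """Yield lists of audio file paths from `folder`, `batch_size` at a time."""
    files = sorted(
        str(p)
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )
    for start in range(0, len(files), batch_size):
        yield files[start:start + batch_size]


# Hypothetical usage with the `asr_model` loaded above:
# for batch in iter_audio_batches("recordings", batch_size=4):
#     results = asr_model.transcribe(batch, batch_size=len(batch))
```

Sorting the paths keeps the order of results reproducible across runs, which makes it easier to match each transcript back to its file.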


## More Information
https://neura.info

## Model Card Authors
Esmaeil Zahedi, Mohsen Yazdinejad

## Model Card Contact
[email protected]