Update README.md
README.md
CHANGED
@@ -1,8 +1,10 @@
-([简体中文](./README_zh.md)|English)
+([简体中文](./README_zh.md)|English|[日本語](./README_ja.md))
 
 
 # Introduction
 
+github [repo](https://github.com/FunAudioLLM/SenseVoice) : https://github.com/FunAudioLLM/SenseVoice
+
 SenseVoice is a speech foundation model with multiple speech understanding capabilities, including automatic speech recognition (ASR), spoken language identification (LID), speech emotion recognition (SER), and audio event detection (AED).
 
 <img src="image/sensevoice2.png">
@@ -30,7 +32,7 @@ Online Demo:
 
 
 <a name="Highligts"></a>
-#
+# Highlights 🎯
 **SenseVoice** focuses on high-accuracy multilingual speech recognition, speech emotion recognition, and audio event detection.
 - **Multilingual Speech Recognition:** Trained with over 400,000 hours of data, supporting more than 50 languages, the recognition performance surpasses that of the Whisper model.
 - **Rich transcribe:**
@@ -42,6 +44,7 @@ Online Demo:
 
 <a name="What's News"></a>
 # What's New 🔥
+- 2024/7: Added Export Features for [ONNX](./demo_onnx.py) and [libtorch](./demo_libtorch.py), as well as Python Version Runtimes: [funasr-onnx-0.4.0](https://pypi.org/project/funasr-onnx/), [funasr-torch-0.1.1](https://pypi.org/project/funasr-torch/)
 - 2024/7: The [SenseVoice-Small](https://www.modelscope.cn/models/iic/SenseVoiceSmall) voice understanding model is open-sourced, which offers high-precision multilingual speech recognition, emotion recognition, and audio event detection capabilities for Mandarin, Cantonese, English, Japanese, and Korean and leads to exceptionally low inference latency.
 - 2024/7: The CosyVoice for natural speech generation with multi-language, timbre, and emotion control. CosyVoice excels in multi-lingual voice generation, zero-shot voice generation, cross-lingual voice cloning, and instruction-following capabilities. [CosyVoice repo](https://github.com/FunAudioLLM/CosyVoice) and [CosyVoice space](https://www.modelscope.cn/studios/iic/CosyVoice-300M).
 - 2024/7: [FunASR](https://github.com/modelscope/FunASR) is a fundamental speech recognition toolkit that offers a variety of features, including speech recognition (ASR), Voice Activity Detection (VAD), Punctuation Restoration, Language Models, Speaker Verification, Speaker Diarization and multi-talker ASR.