FireRedASR-AED-L / README.md
license: apache-2.0

FireRedASR is a family of large-scale automatic speech recognition (ASR) models that supports Mandarin, Chinese dialects, and English, and can also recognize singing lyrics. It achieves a new state of the art on public Mandarin ASR benchmarks.

FireRedASR is designed to cover applications with differing requirements for recognition accuracy and computational efficiency. It comprises two variants:

  • FireRedASR-LLM: Designed to achieve state-of-the-art (SOTA) performance and to enable seamless end-to-end speech interaction. It adopts an Encoder-Adapter-LLM framework leveraging large language model (LLM) capabilities.
  • FireRedASR-AED: Designed to balance high performance and computational efficiency and to serve as an effective speech representation module in LLM-based speech models. It utilizes an Attention-based Encoder-Decoder (AED) architecture.
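
To make the term "attention-based encoder-decoder" concrete, here is a minimal sketch of cross-attention, the operation that lets a decoder step weigh the encoder's output frames before predicting the next token. It is a generic pure-Python illustration, not FireRedASR's actual code: the function names `softmax` and `cross_attention` and the toy two-dimensional vectors are assumptions made for this example only.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query: Vector, encoder_states: List[Vector]) -> Tuple[Vector, List[float]]:
    """Scaled dot-product attention of one decoder query over all encoder states.

    Returns the context vector (weighted sum of encoder states) and the
    attention weights, one per encoder frame.
    """
    scale = math.sqrt(len(query))
    scores = [
        sum(q * k for q, k in zip(query, state)) / scale
        for state in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return context, weights


# Hypothetical toy input: two encoder frames and one decoder query.
encoder_states = [[1.0, 0.0], [0.0, 1.0]]
decoder_query = [1.0, 0.0]
context, weights = cross_attention(decoder_query, encoder_states)
# The first frame is more similar to the query, so it receives the larger weight.
```

In a real AED model the query, keys, and values are learned projections and the attention is multi-headed; the sketch keeps only the weighting step so the data flow from encoder to decoder is visible.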