buyun committed on
Commit 34a727e · verified · 1 parent: 854d06b

Update README.md

Files changed (1)
  1. README.md +4 -0
README.md CHANGED
@@ -2,6 +2,9 @@
  license: apache-2.0
  ---
  # Step-Audio-AQAA: A Fully End-to-End Expressive Large Audio Language Model
+ **📚 Paper:** [Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model](https://arxiv.org/abs/2506.08967)
+
+ **🚀 Live Demo:** [![Try the Demo](https://img.shields.io/badge/StepFun-Audio-AQAA)](https://www.stepfun.com/docs/zh/step-audio-aqaa?studio_code=step-audio-aqaa&studio_id=121368403356246016&studio_type=1)
 
  ## Model Overview
  Step-Audio-AQAA is a fully end-to-end Large Audio-Language Model (LALM) designed for Audio Query-Audio Answer (AQAA) tasks. It directly processes audio inputs and generates natural, accurate speech responses without relying on traditional ASR and TTS modules, eliminating cascading errors and simplifying the system architecture.
@@ -43,6 +46,7 @@ Step-Audio-AQAA consists of three core modules:
  - **AQTA Dataset**: Audio query-text answer pairs.
  - **AQTAA Dataset**: Audio query-text answer-audio answer triplets generated from AQTA.
 
+
  ## Citation
  ```bibtex
  @misc{huang2025stepaudioaqaa,