Update README.md
README.md
CHANGED
@@ -2,6 +2,9 @@
 license: apache-2.0
 ---
 # Step-Audio-AQAA: A Fully End-to-End Expressive Large Audio Language Model
+**📚 Paper:** [Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model](https://arxiv.org/abs/2506.08967)
+
+**🚀 Live Demo:** [](https://www.stepfun.com/docs/zh/step-audio-aqaa?studio_code=step-audio-aqaa&studio_id=121368403356246016&studio_type=1)
 
 ## Model Overview
 Step-Audio-AQAA is a fully end-to-end Large Audio-Language Model (LALM) designed for Audio Query-Audio Answer (AQAA) tasks. It directly processes audio inputs and generates natural, accurate speech responses without relying on traditional ASR and TTS modules, eliminating cascading errors and simplifying the system architecture.
@@ -43,6 +46,7 @@ Step-Audio-AQAA consists of three core modules:
 - **AQTA Dataset**: Audio query-text answer pairs.
 - **AQTAA Dataset**: Audio query-text answer-audio answer triplets generated from AQTA.
 
+
 ## Citation
 ```bibtex
 @misc{huang2025stepaudioaqaa,