BreezyVoice: Adapting TTS for Taiwanese Mandarin with Enhanced Polyphone Disambiguation -- Challenges and Insights Paper • 2501.17790 • Published Jan 29, 2025
Group Think: Multiple Concurrent Reasoning Agents Collaborating at Token Level Granularity Paper • 2505.11107 • Published May 16, 2025
TASTE: Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling Paper • 2504.07053 • Published Apr 9, 2025
The Breeze 2 Herd of Models: Traditional Chinese LLMs Based on Llama with Vision-Aware and Function-Calling Capabilities Paper • 2501.13921 • Published Jan 23, 2025
Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning Paper • 2307.10274 • Published Jul 18, 2023
Advancing the Evaluation of Traditional Chinese Language Models: Towards a Comprehensive Benchmark Suite Paper • 2309.08448 • Published Sep 15, 2023
Let's Fuse Step by Step: A Generative Fusion Decoding Algorithm with LLMs for Multi-modal Text Recognition Paper • 2405.14259 • Published May 23, 2024
Enhancing Function-Calling Capabilities in LLMs: Strategies for Prompt Formats, Data Integration, and Multilingual Translation Paper • 2412.01130 • Published Dec 2, 2024