NTU Speech Processing & Machine Learning Lab

university

https://twitter.com/ntu_spml

AI & ML interests

Speech Processing, Self-Supervised Learning, ASR, TTS, Voice Conversion, Spoken Question Answering

ntu-spml's activity

dcml0714 authored a paper 3 days ago

Audio-Aware Large Language Models as Judges for Speaking Styles

Paper • 2506.05984 • Published 6 days ago • 14

kehanlu authored 4 papers 8 days ago

Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision

Paper • 2401.00273 • Published Dec 30, 2023

A context-aware knowledge transferring strategy for CTC-based ASR

Paper • 2210.06244 • Published Oct 12, 2022

Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data

Paper • 2409.20007 • Published Sep 30, 2024 • 1

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks

Paper • 2411.05361 • Published Nov 8, 2024 • 1

Splend1dchan authored 9 papers 24 days ago

Extending the Pre-Training of BLOOM for Improved Support of Traditional Chinese: Models, Methods and Results

Paper • 2303.04715 • Published Mar 8, 2023

Advancing the Evaluation of Traditional Chinese Language Models: Towards a Comprehensive Benchmark Suite

Paper • 2309.08448 • Published Sep 15, 2023

Breeze-7B Technical Report

Paper • 2403.02712 • Published Mar 5, 2024

Let's Fuse Step by Step: A Generative Fusion Decoding Algorithm with LLMs for Multi-modal Text Recognition

Paper • 2405.14259 • Published May 23, 2024 • 2

Enhancing Function-Calling Capabilities in LLMs: Strategies for Prompt Formats, Data Integration, and Multilingual Translation

Paper • 2412.01130 • Published Dec 2, 2024

The Breeze 2 Herd of Models: Traditional Chinese LLMs Based on Llama with Vision-Aware and Function-Calling Capabilities

Paper • 2501.13921 • Published Jan 23 • 3

BreezyVoice: Adapting TTS for Taiwanese Mandarin with Enhanced Polyphone Disambiguation -- Challenges and Insights

Paper • 2501.17790 • Published Jan 29 • 3

FineWeb-zhtw: Scalable Curation of Traditional Chinese Text Data from the Web

Paper • 2411.16387 • Published Nov 25, 2024

Group Think: Multiple Concurrent Reasoning Agents Collaborating at Token Level Granularity

Paper • 2505.11107 • Published 27 days ago • 28

andybi7676 authored a paper about 2 months ago

TASTE: Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling

Paper • 2504.07053 • Published Apr 9 • 3

dcml0714 authored 5 papers 3 months ago

REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR

Paper • 2402.03988 • Published Feb 6, 2024

Advancing Large Language Models to Capture Varied Speaking Styles and Respond Properly in Spoken Conversations

Paper • 2402.12786 • Published Feb 20, 2024

Team members: 9
