SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer Paper • 2409.08425 • Published Sep 12, 2024 • 10
CapSpeech: Enabling Downstream Applications in Style-Captioned Text-to-Speech Paper • 2506.02863 • Published Jun 3 • 8
Noise-robust Speech Separation with Fast Generative Correction Paper • 2406.07461 • Published Jun 11, 2024
SoloSpeech: Enhancing Intelligibility and Quality in Target Speech Extraction through a Cascaded Generative Pipeline Paper • 2505.19314 • Published May 25 • 4
The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts Paper • 2401.13136 • Published Jan 23, 2024
Certified Mitigation of Worst-Case LLM Copyright Infringement Paper • 2504.16046 • Published Apr 22 • 14
Source-Free and Image-Only Unsupervised Domain Adaptation for Category Level Object Pose Estimation Paper • 2401.10848 • Published Jan 19, 2024
Animal3D: A Comprehensive Dataset of 3D Animal Pose and Shape Paper • 2308.11737 • Published Aug 22, 2023 • 2
Gaussian Scenes: Pose-Free Sparse-View Scene Reconstruction using Depth-Enhanced Diffusion Priors Paper • 2411.15966 • Published Nov 24, 2024
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published Feb 13 • 195
Dynamic Relation Transformer for Contextual Text Block Detection Paper • 2401.09232 • Published Jan 17, 2024
The Essence of Contextual Understanding in Theory of Mind: A Study on Question Answering with Story Characters Paper • 2501.01705 • Published Jan 3
Few-Shot Character Understanding in Movies as an Assessment to Meta-Learning of Theory-of-Mind Paper • 2211.04684 • Published Nov 9, 2022
3DSRBench: A Comprehensive 3D Spatial Reasoning Benchmark Paper • 2412.07825 • Published Dec 10, 2024 • 12