Don't Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning Paper • 2505.17813 • Published 14 days ago • 54
Speaker Normalization for Self-supervised Speech Emotion Recognition Paper • 2202.01252 • Published Feb 2, 2022
D-Flow: Differentiating through Flows for Controlled Generation Paper • 2402.14017 • Published Feb 21, 2024 • 8
SpiRit-LM: Interleaved Spoken and Written Language Model Paper • 2402.05755 • Published Feb 8, 2024 • 15
Masked Audio Generation using a Single Non-Autoregressive Transformer Paper • 2401.04577 • Published Jan 9, 2024 • 44
Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation Paper • 2309.16429 • Published Sep 28, 2023 • 11
Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies Paper • 2010.10802 • Published Oct 21, 2020
Speech Emotion Recognition using Self-Supervised Features Paper • 2202.03896 • Published Feb 7, 2022
Perceptual Score: What Data Modalities Does Your Model Perceive? Paper • 2110.14375 • Published Oct 27, 2021
Are VQA Systems RAD? Measuring Robustness to Augmented Data with Focused Interventions Paper • 2106.04484 • Published Jun 8, 2021
Augmentation Invariant Discrete Representation for Generative Spoken Language Modeling Paper • 2209.15483 • Published Sep 30, 2022