LIBERO-Plus: In-depth Robustness Analysis of Vision-Language-Action Models Paper • 2510.13626 • Published 1 day ago • 39
Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections Paper • 2507.00018 • Published Jun 15
XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs Paper • 2506.23325 • Published Jun 29
VStyle: A Benchmark for Voice Style Adaptation with Spoken Instructions Paper • 2509.09716 • Published Sep 9 • 10
MOSS-Speech: Towards True Speech-to-Speech Models Without Text Guidance Paper • 2510.00499 • Published 16 days ago • 18
Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models Paper • 2510.05034 • Published 11 days ago • 43