Submitted by MiniMax-AI 91 MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder · 20 authors 3
Submitted by ZacharyNovack 7 Fast Text-to-Audio Generation with Adversarial Post-Training · 11 authors 2
Submitted by Junjie-Ye 7 A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models · 15 authors 2
Submitted by akhaliq 6 Aya Vision: Advancing the Frontier of Multilingual Multimodality · 25 authors 2
Submitted by akhaliq 5 AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale · 8 authors 2
Submitted by jinghan23 5 Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging · 8 authors 2
Submitted by Omartificial-Intelligence-Space 4 Advancing Arabic Reverse Dictionary Systems: A Transformer-Based Approach with Dataset Construction Guidelines · 7 authors 2
Submitted by EdBianchi 3 SkillFormer: Unified Multi-View Video Understanding for Proficiency Estimation · 2 authors 2
Submitted by taiwang 1 NavDP: Learning Sim-to-Real Navigation Diffusion Policy with Privileged Information Guidance · 9 authors 2
Submitted by Omartificial-Intelligence-Space 1 Optimizing Retrieval-Augmented Generation: Analysis of Hyperparameter Impact on Performance and Efficiency · 4 authors 2
Submitted by trucnguyen28 1 ViMRHP: A Vietnamese Benchmark Dataset for Multimodal Review Helpfulness Prediction via Human-AI Collaborative Annotation · 4 authors 2
Submitted by onekq - Tests as Prompt: A Test-Driven-Development Benchmark for LLM Code Generation · 1 author 1