Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model • Paper • 2503.16282 • Published 4 days ago • 5
TaoAvatar: Real-Time Lifelike Full-Body Talking Avatars for Augmented Reality via 3D Gaussian Splatting • Paper • 2503.17032 • Published 3 days ago • 9
Single Image Iterative Subject-driven Generation and Editing • Paper • 2503.16025 • Published 4 days ago • 8
ETVA: Evaluation of Text-to-Video Alignment via Fine-grained Question Generation and Answering • Paper • 2503.16867 • Published 4 days ago • 8
When Less is Enough: Adaptive Token Reduction for Efficient Image Representation • Paper • 2503.16660 • Published 4 days ago • 50
From Head to Tail: Towards Balanced Representation in Large Vision-Language Models through Adaptive Data Calibration • Paper • 2503.12821 • Published 8 days ago • 6
MathFlow: Enhancing the Perceptual Flow of MLLMs for Visual Mathematical Problems • Paper • 2503.16549 • Published 5 days ago • 9
When Preferences Diverge: Aligning Diffusion Models with Minority-Aware Adaptive DPO • Paper • 2503.16921 • Published 3 days ago • 5
FastCuRL: Curriculum Reinforcement Learning with Progressive Context Extension for Efficient Training R1-like Reasoning Models • Paper • 2503.17287 • Published 3 days ago • 7
Modifying Large Language Model Post-Training for Diverse Creative Writing • Paper • 2503.17126 • Published 3 days ago • 15
OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement • Paper • 2503.17352 • Published 3 days ago • 18
Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation • Paper • 2503.16430 • Published 4 days ago • 24
MARS: A Multi-Agent Framework Incorporating Socratic Guidance for Automated Prompt Optimization • Paper • 2503.16874 • Published 3 days ago • 37
RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints • Paper • 2503.16408 • Published 4 days ago • 31
MAPS: A Multi-Agent Framework Based on Big Seven Personality and Socratic Guidance for Multimodal Scientific Problem Solving • Paper • 2503.16905 • Published 3 days ago • 44
LEGION: Learning to Ground and Explain for Synthetic Image Detection • Paper • 2503.15264 • Published 5 days ago • 18
CURIE: Evaluating LLMs On Multitask Scientific Long Context Understanding and Reasoning • Paper • 2503.13517 • Published 10 days ago • 4