InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14, 2025 • 280
StrandHead: Text to Strand-Disentangled 3D Head Avatars Using Hair Geometric Priors Paper • 2412.11586 • Published Dec 16, 2024 • 11
SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding Paper • 2412.09604 • Published Dec 12, 2024 • 39
Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability Paper • 2402.12225 • Published Feb 19, 2024 • 9
VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation Paper • 2312.09251 • Published Dec 14, 2023 • 10
ML-Bench: Large Language Models Leverage Open-source Libraries for Machine Learning Tasks Paper • 2311.09835 • Published Nov 16, 2023 • 11