Marrying Autoregressive Transformer and Diffusion with Multi-Reference Autoregression Paper • 2506.09482 • Published Jun 11 • 46
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset Paper • 2505.09568 • Published May 14 • 97
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT Paper • 2505.00703 • Published May 1 • 45