Xiaoye Qu

Xiaoye08

AI & ML interests

None yet

Recent Activity

upvoted a paper 2 days ago

PyVision: Agentic Vision with Dynamic Tooling

upvoted a paper 2 days ago

Scaling RL to Long Videos

upvoted a paper 3 days ago

Perception-Aware Policy Optimization for Multimodal Reasoning

upvoted 2 papers 2 days ago

PyVision: Agentic Vision with Dynamic Tooling

Paper • 2507.07998 • Published 3 days ago • 26

Scaling RL to Long Videos

Paper • 2507.07966 • Published 3 days ago • 109

upvoted a paper 3 days ago

Perception-Aware Policy Optimization for Multimodal Reasoning

Paper • 2507.06448 • Published 4 days ago • 41

upvoted a paper 5 days ago

Pre-Trained Policy Discriminators are General Reward Models

Paper • 2507.05197 • Published 6 days ago • 33

authored a paper 9 days ago

Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers

Paper • 2506.23918 • Published 13 days ago • 76

upvoted 2 papers 9 days ago

IntFold: A Controllable Foundation Model for General and Specialized Biomolecular Structure Prediction

Paper • 2507.02025 • Published 11 days ago • 35

Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers

Paper • 2506.23918 • Published 13 days ago • 76

upvoted a paper 11 days ago

DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation

Paper • 2506.20639 • Published 18 days ago • 26

upvoted a collection 18 days ago

Revisual-R1

Collection

🚀ReVisual-R1 is a 7B open-source multimodal language model that follows a three-stage curriculum—cold-start pre-training, multimodal reinforcement le • 6 items • Updated 6 days ago • 3

upvoted a paper 25 days ago

LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs

Paper • 2506.14429 • Published 26 days ago • 44

upvoted a paper 26 days ago

MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

Paper • 2506.13585 • Published 27 days ago • 252

upvoted a paper 30 days ago

SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks

Paper • 2506.10954 • Published about 1 month ago • 51

upvoted 4 papers about 1 month ago

Video World Models with Long-term Spatial Memory

Paper • 2506.05284 • Published Jun 5 • 53

VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models

Paper • 2505.23656 • Published May 29 • 24

Unfolding Spatial Cognition: Evaluating Multimodal Models on Visual Simulations

Paper • 2506.04633 • Published Jun 5 • 18

Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models

Paper • 2506.05176 • Published Jun 5 • 62

liked 2 models about 1 month ago

csfufu/Revisual-R1-Coldstart

Image-Text-to-Text • 8B • Updated 18 days ago • 1.31k • 5

csfufu/Revisual-R1-final

Image-Text-to-Text • 8B • Updated Jun 5 • 1.39k • 6

commented a paper about 1 month ago

Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning

Paper • 2506.04207 • Published Jun 4 • 46

upvoted a paper about 1 month ago

Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning

Paper • 2506.04207 • Published Jun 4 • 46
