Even Small Reasoners Should Quote Their Sources: Introducing the Pleias-RAG Model Family Paper • 2504.18225 • Published Apr 2025 • 12
JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse Paper • 2503.16365 • Published Mar 20 • 40
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing Paper • 2503.10639 • Published Mar 13 • 50
Gemini Embedding: Generalizable Embeddings from Gemini Paper • 2503.07891 • Published Mar 10 • 38
OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference Paper • 2502.18411 • Published Feb 25 • 73
EgoLife Collection CVPR 2025 - EgoLife: Towards Egocentric Life Assistant. Homepage: https://egolife-ai.github.io/ • 10 items • Updated Mar 7 • 17
Multimodal-SAE Collection Sparse autoencoders (SAEs) hooked on LLaVA • 5 items • Updated Mar 4 • 8
LLaVA-Video Collection Models focused on video understanding (previously known as LLaVA-NeXT-Video) • 8 items • Updated Feb 21 • 61
Evaluation Agent: Efficient and Promptable Evaluation Framework for Visual Generative Models Paper • 2412.09645 • Published Dec 10, 2024 • 37
Large Multi-modal Models Can Interpret Features in Large Multi-modal Models Paper • 2411.14982 • Published Nov 22, 2024 • 17
MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels Paper • 2405.07526 • Published May 13, 2024 • 22
PUMA: Empowering Unified MLLM with Multi-granular Visual Generation Paper • 2410.13861 • Published Oct 17, 2024 • 57
LLaVA-Critic Collection A general evaluator for assessing model performance • 6 items • Updated Oct 6, 2024 • 10