Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces • arXiv:2506.00123 • Published 11 days ago
ZeroGUI: Automating Online GUI Learning at Zero Human Cost • arXiv:2505.23762 • Published 12 days ago
Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning • arXiv:2505.16410 • Published 20 days ago
VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models • arXiv:2504.15279 • Published Apr 21
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models • arXiv:2504.10479 • Published Apr 14
Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing • arXiv:2504.02826 • Published Apr 3
OmniSVG: A Unified Scalable Vector Graphics Generation Model • arXiv:2504.06263 • Published Apr 8
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks • arXiv:2312.14238 • Published Dec 21, 2023
Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy • arXiv:2503.19757 • Published Mar 25
Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM • arXiv:2503.14478 • Published Mar 18
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention • arXiv:2502.11089 • Published Feb 16
VisualPRM: An Effective Process Reward Model for Multimodal Reasoning • arXiv:2503.10291 • Published Mar 13
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing • arXiv:2503.10639 • Published Mar 13