Wenhao Chai

wchai

http://rese1f.github.io

AI & ML interests

computer vision, artificial intelligence

Recent Activity

liked a dataset 12 days ago

primerL/real_world_sample

wchai's activity

upvoted a paper 9 days ago

Large Language Models for Data Synthesis

Paper • 2505.14752 • Published 22 days ago • 48

upvoted a paper 12 days ago

Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model

Paper • 2505.23606 • Published 12 days ago • 14

upvoted 2 papers 14 days ago

G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning

Paper • 2505.13426 • Published 22 days ago • 12

Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression

Paper • 2505.19602 • Published 16 days ago • 13

upvoted a paper 20 days ago

MMaDA: Multimodal Large Diffusion Language Models

Paper • 2505.15809 • Published 20 days ago • 87

upvoted a paper 30 days ago

Flow-GRPO: Training Flow Matching Models via Online RL

Paper • 2505.05470 • Published May 8 • 78

upvoted 3 papers about 1 month ago

Practical Efficiency of Muon for Pretraining

Paper • 2505.02222 • Published May 4 • 37

TEMPURA: Temporal Event Masked Prediction and Understanding for Reasoning in Action

Paper • 2505.01583 • Published May 2 • 9

Science-T2I: Addressing Scientific Illusions in Image Synthesis

Paper • 2504.13129 • Published Apr 17 • 3

upvoted 4 papers about 2 months ago

Step1X-Edit: A Practical Framework for General Image Editing

Paper • 2504.17761 • Published Apr 24 • 88

It's All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization

Paper • 2504.13173 • Published Apr 17 • 19

WORLDMEM: Long-term Consistent World Simulation with Memory

Paper • 2504.12369 • Published Apr 16 • 34

GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation

Paper • 2504.08736 • Published Apr 11 • 47

upvoted a collection 2 months ago

Kimi-VL-A3B

Collection

Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 6 items • Updated Apr 12 • 65

upvoted 3 papers 2 months ago

OmniSVG: A Unified Scalable Vector Graphics Generation Model

Paper • 2504.06263 • Published Apr 8 • 168

Less-to-More Generalization: Unlocking More Controllability by In-Context Generation

Paper • 2504.02160 • Published Apr 2 • 37

An Empirical Study of GPT-4o Image Generation Capabilities

Paper • 2504.05979 • Published Apr 8 • 63