Kimi-VL-A3B Collection • Moonshot's efficient MoE VLMs, exceptional at agent tasks, long context, and thinking • 7 items • Updated 18 days ago • 72
Article 🤔👀🎬🖥️📖 Kimi-VL-A3B-Thinking-2506: A Quick Navigation • By moonshotai and 1 other • 27 days ago • 63
ProBench: Judging Multimodal Foundation Models on Open-ended Multi-domain Expert Tasks Paper • 2503.06885 • Published Mar 10 • 4
VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning? Paper • 2505.23359 • Published May 29 • 40
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning Paper • 2504.08837 • Published Apr 10 • 43
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14 • 277
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published Feb 20 • 146
VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation Paper • 2411.13281 • Published Nov 20, 2024 • 22
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 409
MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale Paper • 2412.05237 • Published Dec 6, 2024 • 48
VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation Paper • 2412.00927 • Published Dec 1, 2024 • 28
Data Engineering for Scaling Language Models to 128K Context Paper • 2402.10171 • Published Feb 15, 2024 • 26
AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark Paper • 2410.03051 • Published Oct 4, 2024 • 6
Aria: An Open Multimodal Native Mixture-of-Experts Model Paper • 2410.05993 • Published Oct 8, 2024 • 112