Yaowei Zheng

hiyouga

https://github.com/hiyouga

AI & ML interests

LLM Knowledge Management

Recent Activity

upvoted an article 6 days ago

Article

🤔👀🎬🖥️📖 Kimi-VL-A3B-Thinking-2506: A Quick Navigation

and 1 other • 7 days ago • 51

liked a dataset 10 days ago

Saigyouji-Yuyuko1000/dapo17k

Viewer • Updated 5 days ago • 17.9k • 118 • 1

updated a Space 12 days ago

204

LLaMA Board

🦙

Fine-tuning large language model with Gradio UI

liked 2 models 13 days ago

reducto/RolmOCR

Image-Text-to-Text • 8B • Updated Apr 2 • 128k • 441

nanonets/Nanonets-OCR-s

Image-Text-to-Text • 4B • Updated 8 days ago • 202k • 1.23k

New activity in hiyouga/rl-mixed-dataset 16 days ago

[bot] Conversion to Parquet

#1 opened 17 days ago by parquet-converter

updated 2 datasets 17 days ago

hiyouga/journeybench-multi-image-vqa

Viewer • Updated Apr 14 • 313 • 268 • 1

hiyouga/rl-mixed-dataset

Viewer • Updated 17 days ago • 3.6k • 148

published a dataset 17 days ago

hiyouga/rl-mixed-dataset

Viewer • Updated 17 days ago • 3.6k • 148

liked a dataset 23 days ago

open-thoughts/OpenThoughts3-1.2M

Viewer • Updated 19 days ago • 1.2M • 21k • 113

liked a model 23 days ago

open-thoughts/OpenThinker3-7B

Text Generation • 8B • Updated 19 days ago • 14.9k • 115

upvoted a paper 26 days ago

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published 26 days ago • 164

liked a model 30 days ago

ByteDance-Seed/BAGEL-7B-MoT

Any-to-Any • 15B • Updated 5 days ago • 7.55k • 1.07k

liked a Space about 1 month ago

436

AI Deadlines

⚡

Organize project deadlines with AI assistance

upvoted 2 papers about 1 month ago

Emerging Properties in Unified Multimodal Pretraining

Paper • 2505.14683 • Published May 20 • 130

AttentionInfluence: Adopting Attention Head Influence for Weak-to-Strong Pretraining Data Selection

Paper • 2505.07293 • Published May 12 • 26

liked a dataset about 1 month ago

ByteDance-Seed/mga-fineweb-edu

Viewer • Updated May 19 • 846M • 5.6k • 27

upvoted a paper about 1 month ago

Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities

Paper • 2505.02567 • Published May 5 • 77

upvoted a paper about 2 months ago

Seed1.5-VL Technical Report

Paper • 2505.07062 • Published May 11 • 145

liked a model about 2 months ago

ByteDance-Seed/UI-TARS-1.5-7B

Image-Text-to-Text • 8B • Updated Apr 18 • 99.7k • 305
