Yuhang Zang

yuhangzang

https://yuhangzang.github.io/

AI & ML interests

🤗 HuggingFace is all you need

Recent Activity

authored a paper 4 days ago

ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing

upvoted a paper 4 days ago

Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs

liked a Space 4 days ago

nanotron/ultrascale-playbook

upvoted 2 papers 4 days ago

Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs

Paper • 2506.19290 • Published 5 days ago • 47

ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing

Paper • 2506.19848 • Published 5 days ago • 24

upvoted a paper 11 days ago

LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs

Paper • 2506.14429 • Published 12 days ago • 43

upvoted a paper 19 days ago

Reinforcement Pre-Training

Paper • 2506.08007 • Published 20 days ago • 234

upvoted a paper 23 days ago

Video World Models with Long-term Spatial Memory

Paper • 2506.05284 • Published 24 days ago • 53

upvoted a collection about 1 month ago

VideoRoPE: What Makes for Good Video Rotary Position Embeddi

Collection

A storage repo for VideoRoPE. • 6 items • Updated 12 days ago • 3

upvoted an article about 1 month ago

Introducing Pivotal Token Search (PTS): Targeting Critical Decision Points in LLM Training

Article • May 17 • 5

upvoted 3 papers about 1 month ago

Visual Agentic Reinforcement Fine-Tuning

Paper • 2505.14246 • Published May 20 • 31

Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models

Paper • 2406.13542 • Published Jun 19, 2024 • 17

WorldPM: Scaling Human Preference Modeling

Paper • 2505.10527 • Published May 15 • 33

upvoted a collection 2 months ago

MM-IFEngine

Collection

Datasets, Benchmark and Checkpoints for MM-IFEngine • 2 items • Updated Apr 26 • 5

upvoted 4 papers 3 months ago

Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback

Paper • 2503.22230 • Published Mar 28 • 44

upvoted 4 papers 4 months ago

Unified Reward Model for Multimodal Understanding and Generation

Paper • 2503.05236 • Published Mar 7 • 124

Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published Mar 3 • 80

OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference

Paper • 2502.18411 • Published Feb 25 • 73

Thus Spake Long-Context Large Language Model

Paper • 2502.17129 • Published Feb 24 • 73

upvoted an article 4 months ago

Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment

Article • Feb 11 • 47