Chengsong Huang

ChengsongHuang

https://chengsong-huang.github.io/

hcscctv

Recent Activity

updated a dataset 3 days ago

HINT-lab/qwen3_frequent_solver_v1_v2

published a dataset 3 days ago

HINT-lab/qwen3_frequent_solver_v1_v2

upvoted a paper 2 days ago

Scaling RL to Long Videos

Paper • 2507.07966 • Published 3 days ago • 114

upvoted a paper 3 days ago

Towards Solving More Challenging IMO Problems via Decoupled Reasoning and Proving

Paper • 2507.06804 • Published 6 days ago • 14

upvoted a paper 12 days ago

Scaling Speculative Decoding with Lookahead Reasoning

Paper • 2506.19830 • Published 19 days ago • 12

upvoted a collection 18 days ago

Self-Calibration

Efficient Test-Time Scaling via Self-Calibration https://arxiv.org/abs/2503.00031 • 7 items • Updated Jun 8 • 2

upvoted a paper 19 days ago

Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights

Paper • 2506.16406 • Published 24 days ago • 119

upvoted a paper 23 days ago

Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning

Paper • 2506.09033 • Published Jun 10 • 7

upvoted 4 papers about 1 month ago

Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models

Paper • 2506.06395 • Published Jun 5 • 127

Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space

Paper • 2505.15778 • Published May 21 • 17

POSS: Position Specialist Generates Better Draft for Speculative Decoding

Paper • 2506.03566 • Published Jun 4 • 6

Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers

Paper • 2505.21497 • Published May 27 • 105

upvoted a paper about 2 months ago

WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning

Paper • 2505.16421 • Published May 22 • 19

upvoted 9 papers 3 months ago

Generative AI Act II: Test Time Scaling Drives Cognition Engineering

Paper • 2504.13828 • Published Apr 18 • 17

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published Apr 14 • 276

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

Paper • 2504.01990 • Published Mar 31 • 295

OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens

Paper • 2504.07096 • Published Apr 9 • 74

Optimizing Language Model's Reasoning Abilities with Weak Supervision

Paper • 2405.04086 • Published May 7, 2024 • 2

Taming Overconfidence in LLMs: Reward Calibration in RLHF

Paper • 2410.09724 • Published Oct 13, 2024 • 3

On Grounded Planning for Embodied Tasks with Language Models

Paper • 2209.00465 • Published Aug 29, 2022 • 1

Divide, Reweight, and Conquer: A Logit Arithmetic Approach for In-Context Learning

Paper • 2410.10074 • Published Oct 14, 2024 • 1

CrossWordBench: Evaluating the Reasoning Capabilities of LLMs and LVLMs with Controllable Puzzle Generation

Paper • 2504.00043 • Published Mar 30 • 9