Qian Liu

SivilTaram

http://siviltaram.github.io/

AI & ML interests

Cooking cool things

Recent Activity

authored a paper 2 days ago

SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?

upvoted a paper 2 days ago

SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?

commented on a paper 2 days ago

SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?

upvoted a paper 2 days ago

SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?

Paper • 2507.12415 • Published 2 days ago • 28

upvoted a paper 8 days ago

Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization

Paper • 2505.23387 • Published May 29 • 9

upvoted a paper 9 days ago

First Return, Entropy-Eliciting Explore

Paper • 2507.07017 • Published 9 days ago • 23

upvoted an article 10 days ago

SmolLM3: smol, multilingual, long-context reasoner

Article • By … and 22 others • 11 days ago • 554

upvoted 3 papers 10 days ago

Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving

Paper • 2507.06229 • Published 10 days ago • 67

CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization

Paper • 2507.06181 • Published 10 days ago • 39

A Survey on Latent Reasoning

Paper • 2507.06203 • Published 10 days ago • 82

upvoted a paper 15 days ago

ZeCO: Zero Communication Overhead Sequence Parallelism for Linear Attention

Paper • 2507.01004 • Published 17 days ago • 10

upvoted a paper 18 days ago

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Paper • 2506.24119 • Published 18 days ago • 44

upvoted a paper 21 days ago

OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling

Paper • 2506.20512 • Published 24 days ago • 46

upvoted a paper 22 days ago

MMSearch-R1: Incentivizing LMMs to Search

Paper • 2506.20670 • Published 23 days ago • 60

upvoted 2 collections about 1 month ago

MiniMax-M1

MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model. • 6 items • Updated 16 days ago • 108

AceReason

Math and Code reasoning model trained through reinforcement learning (RL) • 7 items • Updated 8 days ago • 13

upvoted 2 papers about 1 month ago

TaskCraft: Automated Generation of Agentic Tasks

Paper • 2506.10055 • Published Jun 11 • 32

MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

Paper • 2506.13585 • Published Jun 16 • 254

upvoted an article about 1 month ago

GRPO for GUI Grounding Done Right

Article • Jun 11 • 30

upvoted a collection about 1 month ago

Qwen3

72 items • Updated Jun 15 • 861

upvoted 3 papers about 2 months ago

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2 • 170

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30 • 133

AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning

Paper • 2505.16400 • Published May 22 • 33