Yinxu Pan

cppowboy

https://github.com/Cppowboy

AI & ML interests

RL for LLM, Code & Math Reasoning, Function Calling, Code Interpreter, Vision-Language Pretraining

Recent Activity

upvoted 2 papers about 19 hours ago

SkillClaw: Let Skills Evolve Collectively with Agentic Evolver

Paper • 2604.08377 • Published 3 days ago • 221

ClawBench: Can AI Agents Complete Everyday Online Tasks?

Paper • 2604.08523 • Published 3 days ago • 167

liked a model 4 days ago

openbmb/VoxCPM2

Text-to-Speech • Updated 4 days ago • 7.45k • 710

liked a dataset 10 days ago

zai-org/CC-Bench-trajectories

Viewer • Updated Sep 30, 2025 • 260 • 708 • 94

upvoted 2 papers 11 days ago

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published 23 days ago • 330

LongCat-Next: Lexicalizing Modalities as Discrete Tokens

Paper • 2603.27538 • Published 14 days ago • 137

upvoted a paper 16 days ago

SlopCodeBench: Benchmarking How Coding Agents Degrade Over Long-Horizon Iterative Tasks

Paper • 2603.24755 • Published 18 days ago • 28

liked a dataset 18 days ago

mercor/APEX-SWE

Updated 19 days ago • 4.64k • 23

liked a dataset 19 days ago

mercor/apex-agents

Viewer • Updated Mar 3 • 480 • 46.6k • 109

upvoted a paper 19 days ago

LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning

Paper • 2603.21065 • Published 21 days ago • 77

New activity in Qwen/Qwen3.5-397B-A17B 20 days ago

Can not reproduce evaluation results on SWE-Verified

#63 opened about 1 month ago by cppowboy

upvoted a paper 23 days ago

MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild

Paper • 2603.17187 • Published 25 days ago • 137

upvoted 5 papers 24 days ago

Online Experiential Learning for Language Models

Paper • 2603.16856 • Published 26 days ago • 58

TRUST-SQL: Tool-Integrated Multi-Turn Reinforcement Learning for Text-to-SQL over Unknown Schemas

Paper • 2603.16448 • Published 26 days ago • 58

InCoder-32B: Code Foundation Model for Industrial Scenarios

Paper • 2603.16790 • Published 26 days ago • 307

MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification

Paper • 2603.15726 • Published 27 days ago • 185

SWE-Skills-Bench: Do Agent Skills Actually Help in Real-World Software Engineering?

Paper • 2603.15401 • Published 27 days ago • 19

New activity in GAIR/OpenSWE 25 days ago

Are these images publicly available?

#2 opened 25 days ago by cppowboy

liked a dataset 27 days ago

GAIR/OpenSWE

Viewer • Updated 26 days ago • 45.3k • 2.43k • 16

upvoted a paper 27 days ago

daVinci-Env: Open SWE Environment Synthesis at Scale

Paper • 2603.13023 • Published 30 days ago • 30
