Boxi Yu

Bertsekas

https://boxiyu.github.io/

AI & ML interests

Coding Agent, Automated Operator

Recent Activity

authored a paper 3 days ago

How Should I Build A Benchmark? Revisiting Code-Related Benchmarks For LLMs

authored a paper 3 days ago

UTBoost: Rigorous Evaluation of Coding Agents on SWE-Bench

upvoted a paper 4 days ago

UTBoost: Rigorous Evaluation of Coding Agents on SWE-Bench

Organizations

None yet

Bertsekas's models

None public yet