Bo Liu

Benjamin-eecs

https://benjamin-eecs.github.io/

AI & ML interests

Reinforcement Learning, Reasoning, Machine Learning Systems

Recent Activity

upvoted a paper 16 days ago

Bootstrapping Task Spaces for Self-Improvement

Paper • 2509.04575 • Published 21 days ago • 5

authored a paper 23 days ago

LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model

Paper • 2509.00676 • Published 26 days ago • 83

upvoted a collection 23 days ago

LLaVA-Critic-R1

Collection

6 items • Updated 22 days ago • 2

upvoted a paper 23 days ago

LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model

Paper • 2509.00676 • Published 26 days ago • 83

upvoted a paper 3 months ago

The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements

Paper • 2506.22419 • Published Jun 27 • 14

New activity in spiral-rl/Spiral-Kuhn-Poker-Qwen3-32B-SFT 3 months ago

feat(enhance dataset card): add metadata, expanded intro, and sample usage

#2 opened 3 months ago by nielsr

New activity in spiral-rl/Spiral-Qwen3-4B 3 months ago

feat(improve model card): add pipeline tag, library name, quickstart, and expanded details

#1 opened 3 months ago by nielsr

New activity in spiral-rl/Spiral-DeepSeek-R1-Distill-Qwen-7B 3 months ago

feat: add pipeline tag, library name, and sample usage

#1 opened 3 months ago by nielsr

updated a dataset 3 months ago

spiral-rl/Spiral-Kuhn-Poker-Qwen3-32B-SFT

Viewer • Updated Jul 5 • 25.5k • 21

updated 2 models 3 months ago

spiral-rl/Spiral-DeepSeek-R1-Distill-Qwen-7B

Text Generation • 8B • Updated Jul 5 • 10 • 2

spiral-rl/Spiral-Qwen3-4B

Text Generation • 4B • Updated Jul 5 • 62 • 4

authored a paper 3 months ago

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Paper • 2506.24119 • Published Jun 30 • 50

published a dataset 3 months ago

spiral-rl/Spiral-Kuhn-Poker-Qwen3-32B-SFT

Viewer • Updated Jul 5 • 25.5k • 21

published 2 models 3 months ago

spiral-rl/Spiral-DeepSeek-R1-Distill-Qwen-7B

Text Generation • 8B • Updated Jul 5 • 10 • 2

spiral-rl/Spiral-Qwen3-4B

Text Generation • 4B • Updated Jul 5 • 62 • 4

updated a collection 3 months ago

SPIRAL

Collection

4 items • Updated Jul 1 • 2

upvoted a paper 3 months ago

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Paper • 2506.24119 • Published Jun 30 • 50

commented a paper 3 months ago

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Paper • 2506.24119 • Published Jun 30 • 50

updated a collection 3 months ago

SPIRAL

Collection

4 items • Updated Jul 1 • 2
