RL4Reasoning

community

Activity Feed

AI & ML interests

None defined yet.

Recent Activity

yuzhen17 authored a paper 14 days ago

Pitfalls of Rule- and Model-based Verifiers -- A Case Study on Mathematical Reasoning

Junteng authored a paper 15 days ago

SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond

Junteng authored a paper 15 days ago

On the Perception Bottleneck of VLMs for Chart Understanding

RL4Reasoning's activity

yuzhen17

authored a paper 14 days ago

Pitfalls of Rule- and Model-based Verifiers -- A Case Study on Mathematical Reasoning

Paper • 2505.22203 • Published 15 days ago • 6

Junteng

authored 2 papers 15 days ago

SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond

Paper • 2505.19641 • Published 17 days ago • 64

On the Perception Bottleneck of VLMs for Chart Understanding

Paper • 2503.18435 • Published Mar 24 • 1

yuzhen17

authored a paper 15 days ago

Learn to Reason Efficiently with Adaptive Length-based Reward Shaping

Paper • 2505.15612 • Published 22 days ago • 32

PeterV09

authored 4 papers 21 days ago

Diving into Self-Evolving Training for Multimodal Reasoning

Paper • 2412.17451 • Published Dec 23, 2024 • 44

SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild

Paper • 2503.18892 • Published Mar 24 • 31

Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging

Paper • 2505.05464 • Published May 8 • 10

Learn to Reason Efficiently with Adaptive Length-based Reward Shaping

Paper • 2505.15612 • Published 22 days ago • 32

PeterV09

updated a model 29 days ago

hkust-nlp/Laser-DE-L4096-7B

Updated 29 days ago • 31

PeterV09

published a model 29 days ago

hkust-nlp/Laser-DE-L4096-7B

Updated 29 days ago • 31

PeterV09

updated a model 29 days ago

hkust-nlp/Laser-D-L4096-7B

Updated 29 days ago • 20

PeterV09

published a model 29 days ago

hkust-nlp/Laser-D-L4096-7B

Updated 29 days ago • 20

PeterV09

published a model about 2 months ago

RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-8192-rtl-cliphigh-hf-1.5B-2_deepscaler_-390

Updated Apr 22

PeterV09

updated a model about 2 months ago

RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-rtl-dynamic-m-e-l4096-cliphigh-hf-1.5B-4_deepscaler_-220

Updated Apr 22 • 8

PeterV09

published a model about 2 months ago

RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-rtl-dynamic-m-e-l4096-cliphigh-hf-1.5B-4_deepscaler_-220

Updated Apr 22 • 8

PeterV09

updated a model about 2 months ago

RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-rtl-dynamic-m-e-cliphigh-hf-1.5B-4_deepscaler_-390

Updated Apr 22 • 9

PeterV09

published a model about 2 months ago

RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-rtl-dynamic-m-e-cliphigh-hf-1.5B-4_deepscaler_-390

Updated Apr 22 • 9

PeterV09

updated a model about 2 months ago

RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-2048-rtl-cliphigh-hf-1.5B-4_deepscaler_-340

Updated Apr 22 • 8

PeterV09

published a model about 2 months ago

RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-2048-rtl-cliphigh-hf-1.5B-4_deepscaler_-340

Updated Apr 22 • 8

PeterV09

updated a model about 2 months ago

RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-4096-rtl-cliphigh-hf-1.5B-4_deepscaler_-140

Updated Apr 22 • 8

Team members 3