kenhktsui/Qwen-0.5B-GRPO-gsm8k-count-wait-cap-cross-correct Text Generation • Updated 14 days ago • 13
ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning Paper • 2502.01100 • Published 19 days ago • 15