demystify-long-cot

community

yuexiang96

authored a paper 3 months ago

ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations

Paper • 2504.00824 • Published Apr 1 • 43

tongyx361

authored a paper 3 months ago

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 132

tongyx361

published 5 datasets 3 months ago

demystify-long-cot/math-train-qwen-rs-n128

Viewer • Updated Jan 23 • 766k • 16

demystify-long-cot/math-train-qwen-rs-n64

Viewer • Updated Jan 23 • 383k • 10

demystify-long-cot/math-train-qwen-rs-n32

Viewer • Updated Jan 23 • 192k • 18

demystify-long-cot/math-train-qwq-rs-n128

Viewer • Updated Jan 21 • 854k • 7

demystify-long-cot/math-train-qwq-rs-n64

Viewer • Updated Jan 21 • 428k • 7

tongyx361

published 13 models 3 months ago

demystify-long-cot/llama-3.1-8b-webit231k-qwq-n2-raw-sft-ppo

8B • Updated Jan 20 • 13

demystify-long-cot/llama-3.1-8b-webit231k-qwq-n1-raw-sft-ppo

8B • Updated Jan 20 • 11

demystify-long-cot/llama-3.1-8b-webit462k-qwq-n8-rft

demystify-long-cot/llama-3.1-8b-webit462k-qwq-n4-rft

demystify-long-cot/llama-3.1-8b-webit462k-qwq-n2-rft

8B • Updated Jan 20 • 11

demystify-long-cot/llama-3.1-8b-webit231k-qwq-n8-rft

8B • Updated Jan 20 • 59

demystify-long-cot/llama-3.1-8b-webit231k-qwq-n4-rft

8B • Updated Jan 20 • 16

demystify-long-cot/llama-3.1-8b-webit462k-qwq-n1-raw-sft

8B • Updated Jan 20 • 12

demystify-long-cot/llama-3.1-8b-webit231k-qwq-n4-raw-sft

8B • Updated Jan 20 • 13

demystify-long-cot/llama-3.1-8b-webit231k-qwq-n2-raw-sft

8B • Updated Jan 20 • 10

demystify-long-cot/llama-3.1-8b-webit231k-qwq-n1-raw-sft

8B • Updated Jan 20 • 10

demystify-long-cot/llama-3.1-8b-math-qwen-n32-rft-ppo

8B • Updated Jan 20 • 11

demystify-long-cot/llama-3.1-8b-math-qwen-n32-rft

8B • Updated Jan 20 • 11