HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter4 Text Generation • Updated about 8 hours ago
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter3 Text Generation • Updated about 10 hours ago
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter2 Text Generation • Updated about 11 hours ago
HanningZhang/Qwen-7B-grpo-plusplus-nocliphigher-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter1 Text Generation • Updated about 12 hours ago
CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training Paper • 2504.13161 • Published 2 days ago • 75
HanningZhang/Qwen2.5-Math-7B-grpo-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter4 Text Generation • Updated 1 day ago • 4
HanningZhang/Qwen2.5-Math-7B-grpo-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter3 Text Generation • Updated 1 day ago • 4
HanningZhang/Qwen2.5-Math-7B-grpo-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter2 Text Generation • Updated 1 day ago • 5
HanningZhang/Qwen2.5-Math-7B-grpo-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter1 Text Generation • Updated 1 day ago • 5
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter8 Text Generation • Updated 2 days ago • 9
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_em-sample1n8-sample8-filter1.0-insufficient0.0-a0.001-b2.0-iter7 Text Generation • Updated 2 days ago • 8