Quan Wei
quanwei0
AI & ML interests
None yet
Organizations
None yet
Models
17
quanwei0/nq-search-r1-ppo-qwen2.5-7b-em-gae-mixed-reward-new7
quanwei0/nq-search-r1-ppo-qwen2.5-7b-em-gae
quanwei0/hotpotqa-search-r1-ppo-qwen2.5-7b-em-gae-mixed-reward-new7
quanwei0/hotpotqa-search-r1-ppo-qwen2.5-7b-em-gae
quanwei0/nq-hotpotqa-search-r1-ppo-qwen2.5-7b-em-gae-mixed-reward-new7-maxturn4
quanwei0/nq-hotpotqa-search-r1-ppo-qwen2.5-7b-em-gae-maxturn4
quanwei0/mt_grpo-aae-coef-1.0-4-outcome-reward-2-turn-reward-max-steps-400-qwen2.5-7b
quanwei0/mt_grpo-aae-coef-1.0-4-outcome-reward-2-turn-reward-max-steps-300-qwen2.5-7b-1
quanwei0/grpo-4-outcome-reward-no-turn-reward-max-steps-300-qwen2.5-7b-1
quanwei0/grpo-4-outcome-reward-2-turn-reward-max-steps-300-qwen2.5-7b-1