wang
David0702
AI & ML interests
None yet
Recent Activity
upvoted an article 17 days ago
Open-R1: a fully open reproduction of DeepSeek-R1
upvoted an article about 1 month ago
From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning
updated a model 10 months ago
David0702/dqn-SpaceInvadersNoFrameskip-v4
Organizations
None yet
Collections 1
models 6
David0702/dqn-SpaceInvadersNoFrameskip-v4 • Reinforcement Learning • Updated • 4
David0702/Taxi-v3 • Reinforcement Learning • Updated
David0702/q-FrozenLake-v1-4x4-noSlippery • Reinforcement Learning • Updated
David0702/ppo-Huggy • Reinforcement Learning • Updated • 40
David0702/ppo-LunarLander-v2 • Reinforcement Learning • Updated • 2
David0702/ppo-LunarLander-v2-1 • Reinforcement Learning • Updated • 1
datasets
None public yet