3

wang

David0702

·

freebooterish8286

AI & ML interests

None yet

Recent Activity

upvoted an article 17 days ago

Open-R1: a fully open reproduction of DeepSeek-R1

upvoted an article about 1 month ago

From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning

updated a model 10 months ago

David0702/dqn-SpaceInvadersNoFrameskip-v4

Organizations

None yet

David0702's activity

upvoted an article 17 days ago

Article

Open-R1: a fully open reproduction of DeepSeek-R1

Jan 28 • 810

upvoted an article about 1 month ago

Article

From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning

Feb 4 • 12

upvoted a collection 12 months ago

DBRX

DBRX is a mixture-of-experts (MoE) large language model trained from scratch by Databricks. • 3 items • Updated Mar 27, 2024 • 94