Maozhou Ge

Gmc2

GHGmc2

AI & ML interests

None yet

Organizations

None yet

Gmc2's activity

liked a model 11 days ago

deepseek-ai/DeepSeek-R1-0528

Text Generation • Updated 11 days ago • 93.6k • 1.86k

upvoted a paper 23 days ago

Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures

Paper • 2505.09343 • Published 26 days ago • 64

upvoted an article 2 months ago

Vision Language Models Explained

Article • and 1 other • Apr 11, 2024 • 375

liked 2 datasets 2 months ago

hiyouga/geometry3k

Viewer • Updated Apr 14 • 3k • 8.38k • 28

Dahoas/full-hh-rlhf

Viewer • Updated Feb 23, 2023 • 125k • 1.5k • 82

liked 2 models 3 months ago

Qwen/Qwen2.5-VL-32B-Instruct

Image-Text-to-Text • Updated Apr 14 • 507k • 381

deepseek-ai/DeepSeek-V3-0324

Text Generation • Updated Mar 27 • 428k • 2.96k

upvoted a collection 3 months ago

🌾Oat-Zero: Understanding R1-Zero-Like Training

Collection • 5 items • Updated Apr 10 • 7

upvoted a paper 3 months ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published Feb 19 • 191

upvoted 2 articles 3 months ago

How 🤗 Accelerate runs very large models thanks to PyTorch

Article • Sep 27, 2022 • 12

Open R1: Update #3

Article • and 9 others • Mar 11 • 292

liked a model 3 months ago

Qwen/QwQ-32B

Text Generation • Updated Mar 11 • 298k • 2.77k

liked a Space 4 months ago

The Ultra-Scale Playbook 🌌

Space • The ultimate guide to training LLM on large GPU Clusters • 2.67k

upvoted 3 articles 4 months ago

Open R1: Update #2

Article • and 6 others • Feb 10 • 214

Open-source DeepResearch – Freeing our search agents

Article • and 4 others • Feb 4 • 1.26k

Open-R1: a fully open reproduction of DeepSeek-R1

Article • and 2 others • Jan 28 • 865

liked a model 5 months ago

deepseek-ai/DeepSeek-R1

Text Generation • Updated Mar 27 • 669k • 12.3k

upvoted a paper 5 months ago

DeepSeek-V3 Technical Report

Paper • 2412.19437 • Published Dec 27, 2024 • 64

liked a model 5 months ago

facebook/multi-token-prediction

Updated Jun 18, 2024 • 369

upvoted a paper 6 months ago

Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions

Paper • 2411.14405 • Published Nov 21, 2024 • 62