Marc Sun

marcsun13

AI & ML interests

LLM, Quantization, Training, Inference

Recent Activity

upvoted an article about 16 hours ago

Fine-tuning Llama 2 70B using PyTorch FSDP

liked a model 6 days ago

deepseek-ai/DeepSeek-R1-0528

reacted to sayakpaul's post with 🚀 8 days ago

Diffusers supports a good variety of quantization backends. It can be challenging to navigate through them, given the complex nature of diffusion pipelines in general. So, @derekl35 set out to write a comprehensive guide that puts users in the front seat. Explore the different backends we support, learn the trade-offs they offer, and finally, check out the cool space we built that lets you compare quantization results. Give it a go here: https://lnkd.in/gf8Pi4-2

marcsun13's activity

upvoted an article about 16 hours ago

Article

Fine-tuning Llama 2 70B using PyTorch FSDP

By

and 3 others •

Sep 13, 2023

• 24

upvoted a collection 14 days ago

Flux quantized checkpoints

This collection groups the quantized FLUX checkpoints that we used in this blog post: https://huggingface.co/blog/diffusers-quantization • 5 items • Updated 14 days ago • 1

upvoted 2 articles 14 days ago

Article

The Transformers Library: standardizing model definitions

By

and 3 others •

20 days ago

• 109

Article

Exploring Quantization Backends in Diffusers

By

and 2 others •

14 days ago

• 31

upvoted a collection 16 days ago

EXL3 models

22 items • Updated about 17 hours ago • 25

upvoted 2 articles about 1 month ago

Article

Introducing AutoRound: Intel’s Advanced Quantization for LLMs and VLMs

By

and 8 others •

Apr 29

• 32

Article

🔥 Announcing FLUX-Juiced: The Fastest Image Generation Endpoint (2.6 times faster)!

By

and 3 others •

Apr 23

• 9

upvoted a collection about 2 months ago

Gemma 3 QAT

Quantization-Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality similar to half precision while using 3x less memory • 15 items • Updated 5 days ago • 195

upvoted 2 articles about 2 months ago

Article

Memory-efficient Diffusion Transformers with Quanto and Diffusers

By

and 1 other •

Jul 30, 2024

• 66

Article

Welcome Llama 4 Maverick & Scout on Hugging Face!

By

and 6 others •

Apr 5

• 144

upvoted a collection about 2 months ago

Llama 4

Llama 4 release • 13 items • Updated Apr 29 • 521

upvoted 3 articles 3 months ago

Article

NVIDIA's GTC 2025 Announcement for Physical AI Developers: New Open Models and Datasets

By

and 4 others •

Mar 18

• 41

Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

By

and 3 others •

Mar 12

• 424

Article

LLM Inference on Edge: A Fun and Easy Guide to run LLMs via React Native on your Phone!

By

and 1 other •

Mar 7

• 60

upvoted a paper 6 months ago

LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models

Paper • 2310.08659 • Published Oct 12, 2023 • 28

upvoted an article 8 months ago

Article

Fixing Gradient Accumulation

By

and 5 others •

Oct 16, 2024

• 53

upvoted 3 articles 9 months ago

Article

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

By

and 5 others •

Sep 18, 2024

• 246

Article

Accelerate 1.0.0

By

and 2 others •

Sep 13, 2024

• 52

Article

SmolLM - blazingly fast and remarkably powerful

By

and 2 others •

Jul 16, 2024

• 374

upvoted an article 10 months ago

Article

XetHub is joining Hugging Face!

By

and 1 other •

Aug 8, 2024

• 95