Lee Park

gogo8232

AI & ML interests

None yet

Recent Activity

upvoted a collection 19 days ago
Gemma 3 Release
upvoted a collection about 1 month ago
Qwen3

Organizations

Hugging Face MCP Course

gogo8232's activity

New activity in google/gemma-3-4b-it 2 months ago
upvoted an article 8 months ago
Article

From DeepSpeed to FSDP and Back Again with Hugging Face Accelerate

By muellerzr and 3 others
54
upvoted an article 12 months ago
Article

Our Transformers Code Agent beats the GAIA benchmark!

By m-ric and 1 other
88
reacted to yushun0410's post with 🔥 12 months ago
Post
4645
Hi Huggingfacers!

Thrilled to introduce Adam-mini, an optimizer that achieves on-par or better performance than AdamW with 45% to 50% less memory footprint. Adam-mini can also achieve 49.5% higher throughput than AdamW on Llama2-7B pre-training.

The design of Adam-mini is inspired by certain Hessian structures we observed on Transformers.

Feel free to try it out! Switch to Adam-mini with the same hyperparameters as AdamW, and it should work with only about half the memory. We hope Adam-mini can help save time, cost, and energy in your tasks!

Paper: "Adam-mini: Use Fewer Learning Rates To Gain More" https://arxiv.org/abs/2406.16793

Code: https://github.com/zyushun/Adam-mini
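To make the memory claim in the post concrete, here is a rough back-of-the-envelope sketch. It is not the Adam-mini implementation. It assumes only what the paper title suggests: that the per-parameter second-moment state of AdamW is replaced by one shared value per block. The function names, the one-block-per-tensor choice, and the tensor sizes are all hypothetical.

```python
# Hypothetical illustration: counts optimizer-state values only.
# This is NOT the Adam-mini algorithm or its API.

def adamw_state_count(param_sizes):
    """AdamW keeps a first moment (m) and a second moment (v) per parameter."""
    return 2 * sum(param_sizes)


def blockwise_state_count(param_sizes, blocks_per_tensor=1):
    """Assumed block-wise scheme: m per parameter, one shared v per block."""
    return sum(param_sizes) + blocks_per_tensor * len(param_sizes)


# Hypothetical tensor sizes (number of parameters in each weight matrix).
param_sizes = [4096 * 4096, 4096 * 11008, 11008 * 4096]

full = adamw_state_count(param_sizes)
reduced = blockwise_state_count(param_sizes)
saving = 1 - reduced / full  # fraction of optimizer state removed

print(f"AdamW state values:      {full}")
print(f"Block-wise state values: {reduced}")
print(f"Optimizer-state saving:  {saving:.1%}")
```

Under these assumptions the saving approaches, but never reaches, 50% of the optimizer state. The 45% to 50% range quoted in the post suggests that some tensors keep more state than this simplified count assumes.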

  • 1 reply
upvoted 2 articles 12 months ago
Article

BM25 for Python: Achieving high performance while simplifying dependencies with *BM25S*⚡

By xhluca
55
upvoted an article 12 months ago
New activity in maywell/ko_youtube_transcription_sample about 1 year ago

Less than 1 minute

3
#2 opened about 1 year ago by gogo8232