I tested Muon vs MuonClip vs Muon+AdamW for fine-tuning LLMs
I just published a blog post on that. Read it here: https://huggingface.co/blog/KingNish/optimizer-part1