Taming LLMs by Scaling Learning Rates with Gradient Grouping Paper • 2506.01049 • Published 10 days ago • 36 • 4
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization Paper • 2504.00999 • Published Apr 1 • 93 • 7