CLaSp: In-Context Layer Skip for Self-Speculative Decoding Paper • 2505.24196 • Published 12 days ago • 13
Learning Dynamics in Continual Pre-Training for Large Language Models Paper • 2505.07796 • Published 29 days ago • 19
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence Paper • 2401.14196 • Published Jan 25, 2024 • 63