Quartet: Native FP4 Training Can Be Optimal for Large Language Models Paper • 2505.14669 • Published May 20, 2025 • 76
Pushing the Limits of Large Language Model Quantization via the Linearity Theorem Paper • 2411.17525 • Published Nov 26, 2024 • 5
Panza: A Personalized Text Writing Assistant via Data Playback and Local Fine-Tuning Paper • 2407.10994 • Published Jun 24, 2024 • 2
Extreme Compression of Large Language Models via Additive Quantization Paper • 2401.06118 • Published Jan 11, 2024 • 13