SVD-Free Low-Rank Adaptive Gradient Optimization for Large Language Models Paper • 2505.17967 • Published May 23 • 17
Quartet: Native FP4 Training Can Be Optimal for Large Language Models Paper • 2505.14669 • Published May 20 • 76
Article CircleGuardBench: New Standard for Evaluating AI Moderation Models By whitecircle-ai and 7 others • May 7 • 53
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders Paper • 2503.03601 • Published Mar 5 • 233
QuEST: Stable Training of LLMs with 1-Bit Weights and Activations Paper • 2502.05003 • Published Feb 7 • 44
Extreme Compression of Large Language Models via Additive Quantization Paper • 2401.06118 • Published Jan 11, 2024 • 13