Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling Paper • 2502.06703 • Published Feb 10, 2025 • 143
Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models Paper • 2501.12370 • Published Jan 21, 2025 • 11
Probing-RAG: Self-Probing to Guide Language Models in Selective Document Retrieval Paper • 2410.13339 • Published Oct 17, 2024
Gorilla: Large Language Model Connected with Massive APIs Paper • 2305.15334 • Published May 24, 2023 • 5
PERL: Parameter Efficient Reinforcement Learning from Human Feedback Paper • 2403.10704 • Published Mar 15, 2024 • 58
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Paper • 2403.03507 • Published Mar 6, 2024 • 188
MoAI: Mixture of All Intelligence for Large Language and Vision Models Paper • 2403.07508 • Published Mar 12, 2024 • 76
Probing Out-of-Distribution Robustness of Language Models with Parameter-Efficient Transfer Learning Paper • 2301.11660 • Published Jan 27, 2023 • 1
From RAGs to rich parameters: Probing how language models utilize external knowledge over parametric information for factual queries Paper • 2406.12824 • Published Jun 18, 2024 • 21
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding Paper • 2412.10302 • Published Dec 13, 2024 • 17
LLM Post-Training: A Deep Dive into Reasoning Large Language Models Paper • 2502.21321 • Published 20 days ago
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey Paper • 2503.12605 • Published 4 days ago • 25
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization Paper • 2503.12937 • Published 3 days ago • 23
Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment Paper • 2501.09620 • Published Jan 16, 2025