Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math Paper • 2504.21233 • Published Apr 30, 2025 • 47
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs Paper • 2503.01743 • Published Mar 3, 2025 • 88
PEMA: An Offsite-Tunable Plug-in External Memory Adaptation for Language Models Paper • 2311.08590 • Published Nov 14, 2023
Scalable and Efficient MoE Training for Multitask Multilingual Models Paper • 2109.10465 • Published Sep 22, 2021
Mixture of Quantized Experts (MoQE): Complementary Effect of Low-bit Quantization and Robustness Paper • 2310.02410 • Published Oct 3, 2023 • 3
FineQuant: Unlocking Efficiency with Fine-Grained Weight-Only Quantization for LLMs Paper • 2308.09723 • Published Aug 16, 2023 • 2
AutoMoE: Heterogeneous Mixture-of-Experts with Adaptive Computation for Efficient Neural Machine Translation Paper • 2210.07535 • Published Oct 14, 2022 • 1
Taming Sparsely Activated Transformer with Stochastic Experts Paper • 2110.04260 • Published Oct 8, 2021 • 2
Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation Paper • 2401.08417 • Published Jan 16, 2024 • 37
A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models Paper • 2309.11674 • Published Sep 20, 2023 • 32