Scaling Laws of Decoder-Only Models on the Multilingual Machine Translation Task Paper • 2409.15051 • Published Sep 23, 2024 • 2
Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers Paper • 2503.00865 • Published Mar 2 • 65
Optimizing Large Language Model Training Using FP4 Quantization Paper • 2501.17116 • Published Jan 28 • 38
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations Paper • 2412.07626 • Published Dec 10, 2024 • 23
PaliGemma 2: A Family of Versatile VLMs for Transfer Paper • 2412.03555 • Published Dec 4, 2024 • 134
Article 🐺🐦‍⬛ LLM Comparison/Test: 25 SOTA LLMs (including QwQ) through 59 MMLU-Pro CS benchmark runs By wolfram • Dec 4, 2024 • 79
X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale Paper • 2410.03115 • Published Oct 4, 2024 • 1
Is Preference Alignment Always the Best Option to Enhance LLM-Based Translation? An Empirical Analysis Paper • 2409.20059 • Published Sep 30, 2024 • 16
InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning Paper • 2409.12568 • Published Sep 19, 2024 • 51
Training Language Models to Self-Correct via Reinforcement Learning Paper • 2409.12917 • Published Sep 19, 2024 • 139
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning Paper • 2405.12130 • Published May 20, 2024 • 51
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites Paper • 2404.16821 • Published Apr 25, 2024 • 58