Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • 18 days ago • 578
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published 30 days ago • 63
Evaluation is All You Need: Strategic Overclaiming of LLM Reasoning Capabilities Through Evaluation Design Paper • 2506.04734 • Published Jun 5 • 19
Krikri: Advancing Open Large Language Models for Greek Paper • 2505.13772 • Published May 19 • 4
Krikri 8B Collection Advanced Open LLM for Greek based on Llama-3.1 8B • 4 items • Updated Jun 2 • 3
Meltemi 7B Collection First Open LLM for Greek based on Mistral 7B • 6 items • Updated May 16 • 2
ILSP Greek Evaluation Suite Collection A collection of test sets for evaluating base and chat LLMs (incl. VLMs) on Greek generation and understanding capabilities • 15 items • Updated Jun 18 • 3
reWordBench: Benchmarking and Improving the Robustness of Reward Models with Transformed Inputs Paper • 2503.11751 • Published Mar 14 • 16
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM By ariG23498 and 3 others • Mar 12 • 447
Can LLMs Predict Citation Intent? An Experimental Analysis of In-context Learning and Fine-tuning on Open LLMs Paper • 2502.14561 • Published Feb 20 • 2
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution Paper • 2502.18449 • Published Feb 25 • 74
Weighted-Reward Preference Optimization for Implicit Model Fusion Paper • 2412.03187 • Published Dec 4, 2024 • 12
Stronger Models are NOT Stronger Teachers for Instruction Tuning Paper • 2411.07133 • Published Nov 11, 2024 • 39
Article Releasing the largest multilingual open pretraining dataset By Pclanglais and 2 others • Nov 13, 2024 • 102
EuroLLM: Multilingual Language Models for Europe Paper • 2409.16235 • Published Sep 24, 2024 • 27
Article 🔥 Argilla 2.0: the data-centric tool for AI makers 🤗 By dvilasuero • Jul 30, 2024 • 38
Meltemi: The first open Large Language Model for Greek Paper • 2407.20743 • Published Jul 30, 2024 • 69