DABStep: Data Agent Benchmark for Multi-step Reasoning By eggie5 and 5 others • Feb 4 • 67
Open-R1: a fully open reproduction of DeepSeek-R1 By eliebak and 2 others • Jan 28 • 822
LeMaterial: an open source initiative to accelerate materials discovery and research By AlexDuvalinho and 9 others • Dec 10, 2024 • 43
CinePile 2.0 - making stronger datasets with adversarial refinement By mfarre and 3 others • Oct 23, 2024 • 15
Fine-tuning LLMs to 1.58bit: extreme quantization made easy By medmekk and 5 others • Sep 18, 2024 • 227
A failed experiment: Infini-Attention, and why we should keep trying? By neuralink and 2 others • Aug 14, 2024 • 60
Llama 3.1 - 405B, 70B & 8B with multilinguality and long context By philschmid and 7 others • Jul 23, 2024 • 231
BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks By terryyz and 8 others • Jun 18, 2024 • 46
StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation By yuxiang630 and 8 others • Apr 29, 2024 • 76
Welcome Llama 3 - Meta's new open LLM By philschmid and 4 others • Apr 18, 2024 • 286
Preference Tuning LLMs with Direct Preference Optimization Methods By kashif and 4 others • Jan 18, 2024 • 50
Welcome Mixtral - a SOTA Mixture of Experts on Hugging Face By lewtun and 6 others • Dec 11, 2023 • 12
The N Implementation Details of RLHF with PPO By vwxyzjn and 2 others • Oct 24, 2023 • 45
Finetune Stable Diffusion Models with DDPO via TRL By metric-space and 3 others • Sep 29, 2023 • 12
Spread Your Wings: Falcon 180B is here By philschmid and 4 others • Sep 6, 2023 • 5