SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines Paper • 2502.14739 • Published about 23 hours ago • 76
Distillation Quantification for Large Language Models Paper • 2501.12619 • Published about 1 month ago
Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs Paper • 2502.12982 • Published 3 days ago • 9
SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models Paper • 2502.13059 • Published 3 days ago
Small Models Struggle to Learn from Strong Reasoners Paper • 2502.12143 • Published 4 days ago • 21
Steel-LLM:From Scratch to Open Source -- A Personal Journey in Building a Chinese-Centric LLM Paper • 2502.06635 • Published 11 days ago • 4
Generating Symbolic World Models via Test-time Scaling of Large Language Models Paper • 2502.04728 • Published 14 days ago • 17