Post: I've run the Open LLM Leaderboard evaluations plus HellaSwag on deepseek-ai/DeepSeek-R1-Distill-Llama-8B and compared the results with meta-llama/Llama-3.1-8B-Instruct, and at first glance R1 does not beat Llama overall. If anyone wants to double-check, the results are posted here: https://github.com/csabakecskemeti/lm_eval_results Have I made a mistake, or is this distilled version (at least) simply not as good as or better than the competition? I'll run the same evaluations on the Qwen 7B distilled version too.
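The linked lm_eval_results repo suggests the numbers were produced with EleutherAI's lm-evaluation-harness. Below is a minimal sketch of how such a run could be reproduced via the harness's Python API, assuming the `lm-eval` package (v0.4+) is installed; the task choice (HellaSwag only) and model arguments are illustrative, not the author's exact configuration.

```python
# Hedged sketch: assumes EleutherAI's lm-evaluation-harness (pip install lm-eval).
# Task list and model_args are illustrative; the post's full Open LLM Leaderboard
# task set and settings are not shown here.
from lm_eval.evaluator import simple_evaluate

results = simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args="pretrained=deepseek-ai/DeepSeek-R1-Distill-Llama-8B,dtype=bfloat16",
    tasks=["hellaswag"],  # add the leaderboard tasks to match the comparison in the post
    batch_size="auto",
)

# Per-task metrics (accuracy, normalized accuracy, etc.)
print(results["results"])
```

Running the same call with `pretrained=meta-llama/Llama-3.1-8B-Instruct` would give the comparison baseline.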
Collection: Visual Language Models. Collection of OpenVINO-optimized models for visual-language assistance • 9 items • Updated 25 days ago • 3
Article: LLM Comparison/Test: 25 SOTA LLMs (including QwQ) through 59 MMLU-Pro CS benchmark runs. By wolfram • Dec 4, 2024 • 77
Post: Great blog post! 🔥 @wolfram https://huggingface.co/blog/wolfram/llm-comparison-test-2024-12-04
kaitchup/Qwen2.5-72B-Instruct-AutoRound-GPTQ-4bit • Text Generation • Updated Nov 26, 2024 • 20 • 6