Non applicable

CombinHorizon

AI & ML interests

language models, speech to text

Organizations

None yet

New activity in CombinHorizon/zetasepic-abliteratedV2-Qwen2.5-32B-Inst-BaseMerge-TIES 5 months ago

Invalid LLM Leaderboard results

#1 opened 7 months ago by

New activity in open-llm-leaderboard/open_llm_leaderboard 5 months ago

failed models, check logs

#1062 opened 8 months ago by

New activity in openlifescienceai/open_medical_llm_leaderboard 7 months ago

Model evaluation and submission stuck of LB.

#17 opened about 1 year ago by

New activity in byroneverson/Yi-1.5-34B-Chat-abliterated 7 months ago

Unable submit this model to the LLM leaderboard (tokenizers issue)

#3 opened 7 months ago by

New activity in open-llm-leaderboard/open_llm_leaderboard 7 months ago

14B model detected as 7B

#1049 opened 8 months ago by

Resubmitting a model to use `chat_template` doesn't re-evaluate, but does change `chat_template` column

#1066 opened 8 months ago by

New activity in BAAI/open_cn_llm_leaderboard 8 months ago

Update leaderboard so can support newer models? (Can't submit newer Qwen2.5, gemma, phi models)

#3 opened about 1 year ago by

New activity in open-llm-leaderboard/open_llm_leaderboard 8 months ago

Feature Request: change request file format to disambiguate chat and non-chat models?

#954 opened 11 months ago by

Suggestion: Adding outlier-resistant averaging methods

#968 opened 11 months ago by

New activity in upstage/open-ko-llm-leaderboard 8 months ago

[Important Notice] Evaluation of Submitted Models

#89 opened about 1 year ago by

New activity in CombinHorizon/huihui-ai-abliterated-Qwen2.5-32B-Inst-BaseMerge-TIES 9 months ago

Adding Evaluation Results

#1 opened 9 months ago by

leaderboard-pr-bot

New activity in AtlaAI/judge-arena 9 months ago

Which models do you want to see on here?

#2 opened 9 months ago by

New activity in BAAI/open_cn_llm_leaderboard 9 months ago

Repeated failures of various running models

#6 opened about 1 year ago by

New activity in WildVision/vision-arena 10 months ago

"NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE." - votes should be ignored in this case

#13 opened about 1 year ago by

New activity in CombinHorizon/Rombos-Qwen2.5-7B-Inst-BaseMerge-TIES 10 months ago

Adding Evaluation Results

#1 opened 10 months ago by

New activity in BAAI/open_cn_llm_leaderboard 10 months ago

What is the current status for the leaderboard? (H-CLCC, and any recent results?)

#7 opened 12 months ago by

New activity in open-llm-leaderboard/open_llm_leaderboard 11 months ago

Eval time vs. score diagram

#950 opened 11 months ago by

Normalization for MMLU-Pro doesn't make sense

#947 opened 11 months ago by