Open LLM Leaderboard Results PR Opener
Add results to model card from Open LLM Leaderboard
Hallucinations Leaderboard
View and submit LLM evaluations
Nexus Function Calling Leaderboard
Visualize model performance on function calling tasks
Tofu Leaderboard
Explore unlearning performance metrics of language models
Enterprise Scenarios Leaderboard
MVBench Leaderboard
Submit model evaluation and view leaderboard
Leaderboard / SeaEval
Browse leaderboard insights across various NLP tasks
Yet Another LLM Leaderboard
Run a Streamlit web app
LLM Safety Leaderboard
View and submit machine learning model evaluations
Japanese Chatbot Arena Leaderboard
Compare two chatbots and vote on the better one
NPHardEval Leaderboard
Explore and compare LLMs through a leaderboard
Open Ita Llm Leaderboard
Track, rank and evaluate open LLMs in the Italian language!
Open Chinese LLM Leaderboard
Display and filter LLM benchmark results
EQ Bench
View EQ-Bench Leaderboard for LLMs
Open PL LLM Leaderboard
Browse and filter LLM benchmark results
Backend
Display and run auto evaluation logs
Berkeley Function Calling Leaderboard
Powered By Intel Leaderboard
Evaluate and submit open-source LLMs for ranking on Intel's leaderboard
Salad Bench Leaderboard
Display model leaderboard from Excel data
Open Multilingual Reasoning Leaderboard
Display and search a leaderboard of math models
Hebrew LLM Leaderboard
LLM Forecasting Leaderboard
Run benchmark tests for AI tools
Indic Llm Leaderboard
Browse and compare Indic language LLMs on a leaderboard