leaderboards - a MoritzLaurer Collection

leaderboards

updated Apr 2

Running

4.53k

Chatbot Arena Leaderboard

🏆

Display chatbot leaderboard and stats
Running on CPU Upgrade

13.3k

Open LLM Leaderboard

🏆

Track, rank and evaluate open LLMs and chatbots
Running on CPU Upgrade

6.05k

MTEB Leaderboard

🥇

Embedding Leaderboard
Running on CPU Upgrade

926

Open ASR Leaderboard

🏆

Request evaluation for a speech model
Running

532

LLM-Perf Leaderboard

🏆

Explore LLM performance across hardware
Running

1.37k

Big Code Models Leaderboard

📈

Search and submit code models for evaluation
Runtime error

78

Human & GPT-4 Evaluation of LLMs Leaderboard

👩
Running

445

Can Ai Code Results

🏆

Can AI Code? An LLM leaderboard including quantized models.
Running on CPU Upgrade

143

Hallucinations Leaderboard

🔥

View and submit LLM evaluations
Runtime error

105

Enterprise Scenarios Leaderboard

🥇
Running on CPU Upgrade

93

LLM Safety Leaderboard

🥇

View and submit machine learning model evaluations
Running

552

Vision Arena (Testing VLMs side-by-side)

🖼

Analyze images to detect and label objects
Running

67

CyberSecEvalTest

📈

Evaluate LLM cybersecurity risks
Running

353

LLM Performance Leaderboard

🐨

View LLM Performance Leaderboard
Runtime error

73

AIR-Bench Leaderboard

🥇

Explore and compare QA and long doc benchmarks
Running on CPU Upgrade

825

Open VLM Leaderboard

🌎

VLMEvalKit Evaluation Results Collection
Running

385

Reward Bench Leaderboard

📐

Display and filter model evaluation results
Running

215

BigCodeBench Leaderboard

🥇

Explore and analyze code evaluation data
Running

10

MJ Bench Leaderboard

🥇

Display and filter multimodal model leaderboard results
Running

113

MTEB Arena

⚔

Display text-to-text translation interface
Runtime error

151

Open LLM Progress Tracker

🔬

Visualize Open vs. Proprietary LLM Progress
Running

105

Judge Arena

💻

Vote on AI responses to rank models
Running

403

TTS Spaces Arena

🤗

Blind vote on HF TTS models!
Running

134

smolagents LLM leaderboard

🏆

A leaderboard for LLMs powering smolagents