BigCodeBench Leaderboard
Explore and analyze code evaluation data
Uncensored General Intelligence Leaderboard
Display chatbot leaderboard and stats
Embedding Leaderboard
Track, rank and evaluate open LLMs and chatbots
Search and submit code models for evaluation
Display a web page
Request evaluation for a speech model
Image Generation and Image Editing Arena & Leaderboard
View LLM Performance Leaderboard
Render a leaderboard for model evaluation
imgsys.org -- arena for text-guided image generation
Embed and use ZeroEval for evaluation tasks
Generate interactive React app data visualizations
Blind vote on HF TTS models!
Tracks perf of LLMs, VLMs and agents on web navigation tasks
DABstep Reasoning Benchmark Leaderboard
Ranking of LLMs for agentic tasks