AI2 WildBench Leaderboard (V2)
🦁
Display and explore model leaderboards and chat history
Display chatbot leaderboard and stats
Track, rank and evaluate open LLMs and chatbots
Embedding Leaderboard
Explore LLM performance across hardware
Search and submit code models for evaluation
Request evaluation for a speech model
Display and filter model evaluation results
Jailbreak the LLM and privacy guardrails