Spaces

The AI App Directory

Hallucination Evaluation Leaderboard

Open LLM Leaderboard Results PR Opener

Add results to model card from Open LLM Leaderboard

Hallucinations Leaderboard

View and submit LLM evaluations

Nexus Function Calling Leaderboard

Visualize model performance on function calling tasks

Tofu Leaderboard

Explore unlearning performance metrics of language models

Enterprise Scenarios Leaderboard

MVBench Leaderboard

Submit model evaluation and view leaderboard

Leaderboard / SeaEval

Browse leaderboard insights across various NLP tasks

Yet Another LLM Leaderboard

Run a Streamlit web app

LLM Safety Leaderboard

View and submit machine learning model evaluations

Japanese Chatbot Arena Leaderboard

Compare two chatbots and vote on the better one

NPHardEval Leaderboard

Explore and compare LLM models through a leaderboard

Open Ita Llm Leaderboard

Track, rank, and evaluate open LLMs in the Italian language!

Open Chinese LLM Leaderboard

Display and filter LLM benchmark results

EQ Bench

View EQ-Bench Leaderboard for LLMs

Open PL LLM Leaderboard

Browse and filter LLM benchmark results

Backend

Display and run auto evaluation logs

Berkeley Function Calling Leaderboard

Powered By Intel Leaderboard

Evaluate and submit open-source LLMs for ranking on Intel's leaderboard

Salad Bench Leaderboard

Display model leaderboard from Excel data

Open Multilingual Reasoning Leaderboard

Display and search a leaderboard of math models

Hebrew LLM Leaderboard

LLM Forecasting Leaderboard

Run benchmark tests for AI tools

Indic Llm Leaderboard

Browse and compare Indic language LLMs on a leaderboard