Update README.md
README.md CHANGED
@@ -8,13 +8,13 @@ sdk_version: 1.42.2
 app_file: app.py
 pinned: false
 license: mit
-short_description:
+short_description: Collective-Model-As-Judge LLM Benchmark
 ---


 # AutoBench 1.0 Demo

-This Space runs a
+This Space runs a Collective-Model-As-Judge LLM benchmark to compare different language models using Hugging Face's Inference API. This is a simplified version of AutoBench 1.0, which relies on multiple inference providers (Anthropic, Grok, Nebius, OpenAI, Together AI, Vertex AI) to manage request load and to support a wider range of models. For more advanced use, please refer to the AutoBench 1.0 repository.

 ## Features
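For reference, a sketch of how the Space's YAML front matter reads after this commit. Only the fields visible in the hunk are shown; the metadata lines above line 8 (e.g. title, emoji, SDK name) sit outside the diff and are assumed unchanged:

```yaml
# Hugging Face Space README front matter (fragment).
# Fields before sdk_version are outside the diff hunk and omitted here.
sdk_version: 1.42.2
app_file: app.py        # entry point the Space runtime executes
pinned: false
license: mit
short_description: Collective-Model-As-Judge LLM Benchmark
---
```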