Community Evals Feedback

#1
by burtenshaw - opened

The Hub provides a decentralized system for tracking model evaluation results. Benchmark datasets host leaderboards, and model repos store evaluation scores that automatically appear on both the model page and the benchmark’s leaderboard.


🔊 Let us know what you think of this feature:

Looks great! Could you provide instructions on how to run evaluations locally with Inspect AI on these 3 benchmarks?

OpenEvals org

Hey @djstrong you can run something like:

inspect eval hf/cais/hle --model hf/openai-community/gpt2

to run a local Transformers model. The provider docs are here: https://inspect.aisi.org.uk/providers.html
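A minimal local run might look like the following sketch. It assumes the `inspect-ai` package is installed and uses the `--limit` flag (a standard Inspect AI option) to cap the number of samples for a quick smoke test; the task and model identifiers are the ones from the command above:

```shell
# Install Inspect AI
pip install inspect-ai

# Run the HLE benchmark against a local Transformers model;
# --limit caps the number of evaluated samples for a quick check
inspect eval hf/cais/hle --model hf/openai-community/gpt2 --limit 10
```

Results land in a local `logs/` directory by default, which you can browse with `inspect view`.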

@burtenshaw
Is it possible to configure a subset of the data to be closed and private? I think that would be super valuable.
