Andrew Jardine
2legit2overfit
AI & ML interests
None yet
Organizations
My leaderboards
- AI2 WildBench Leaderboard (V2)
  🦁 Display and explore model leaderboards and chat history
  Running • 222
- Chatbot Arena Leaderboard
  🏆 Show chatbot performance leaderboard
  Running • 4.5k
- Open LLM Leaderboard
  🏆 Track, rank and evaluate open LLMs and chatbots
  Running on CPU Upgrade • 13.3k
- MTEB Leaderboard
  🥇 Embedding Leaderboard
  Running on CPU Upgrade • 5.97k
My fav papers
- JudgeLM: Fine-tuned Large Language Models are Scalable Judges
  Paper • 2310.17631 • Published • 35
- Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
  Paper • 2310.08491 • Published • 55
- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 110
- BitDelta: Your Fine-Tune May Only Be Worth One Bit
  Paper • 2402.10193 • Published • 23
My Fav datasets
My fav models