LLM Evaluation - a zk67 Collection

updated Jan 20

Judging LLM-as-a-judge with MT-Bench and Chatbot Arena

Paper • 2306.05685 • Published Jun 9, 2023

Note: MT-Bench and Arena. MT-Bench-101: https://arxiv.org/abs/2402.14762