Model Pair,l2
meta-llama/Llama-2-7b-hf vs codellama/CodeLlama-7b-hf,93.18718912197232
meta-llama/Llama-2-7b-hf vs openlm-research/open_llama_7b,122.27028296821305
meta-llama/Llama-2-7b-hf vs huggyllama/llama-7b,122.27950493986255
meta-llama/Llama-2-7b-hf vs lmsys/vicuna-7b-v1.5,4.1913064124248285
meta-llama/Llama-2-7b-hf vs EleutherAI/llemma_7b,88.74339722102076
meta-llama/Llama-2-7b-hf vs lmsys/vicuna-7b-v1.1,122.32382946735395
meta-llama/Llama-2-7b-hf vs microsoft/Orca-2-7b,5.711168968966263
meta-llama/Llama-2-7b-hf vs LLM360/Amber,148.0270618556701
codellama/CodeLlama-7b-hf vs openlm-research/open_llama_7b,140.50532547577853
codellama/CodeLlama-7b-hf vs huggyllama/llama-7b,140.80544442041523
codellama/CodeLlama-7b-hf vs lmsys/vicuna-7b-v1.5,93.25281141868513
codellama/CodeLlama-7b-hf vs EleutherAI/llemma_7b,49.639849790592784
codellama/CodeLlama-7b-hf vs lmsys/vicuna-7b-v1.1,140.8399383650519
codellama/CodeLlama-7b-hf vs microsoft/Orca-2-7b,93.11375432525952
codellama/CodeLlama-7b-hf vs LLM360/Amber,163.1440311418685
openlm-research/open_llama_7b vs huggyllama/llama-7b,131.92685513316152
openlm-research/open_llama_7b vs lmsys/vicuna-7b-v1.5,122.38108086340206
openlm-research/open_llama_7b vs EleutherAI/llemma_7b,133.5973724048443
openlm-research/open_llama_7b vs lmsys/vicuna-7b-v1.1,131.96828017611685
openlm-research/open_llama_7b vs microsoft/Orca-2-7b,120.33676200259515
openlm-research/open_llama_7b vs LLM360/Amber,156.87263745704468
huggyllama/llama-7b vs lmsys/vicuna-7b-v1.5,122.38058419243987
huggyllama/llama-7b vs EleutherAI/llemma_7b,134.03754865916954
huggyllama/llama-7b vs lmsys/vicuna-7b-v1.1,3.2305828500859106
huggyllama/llama-7b vs microsoft/Orca-2-7b,120.63673226643598
huggyllama/llama-7b vs LLM360/Amber,156.8477233676976
lmsys/vicuna-7b-v1.5 vs EleutherAI/llemma_7b,88.80338316392734
lmsys/vicuna-7b-v1.5 vs lmsys/vicuna-7b-v1.1,122.41473367697594
lmsys/vicuna-7b-v1.5 vs microsoft/Orca-2-7b,6.786975900194637
lmsys/vicuna-7b-v1.5 vs LLM360/Amber,148.06894329896906
EleutherAI/llemma_7b vs lmsys/vicuna-7b-v1.1,134.06379757785467
EleutherAI/llemma_7b vs microsoft/Orca-2-7b,88.6362254000865
EleutherAI/llemma_7b vs LLM360/Amber,156.21647923875432
lmsys/vicuna-7b-v1.1 vs microsoft/Orca-2-7b,120.66557634083046
lmsys/vicuna-7b-v1.1 vs LLM360/Amber,156.87929553264604
microsoft/Orca-2-7b vs LLM360/Amber,145.12435121107268