# EdgeRunner-Tactical-7B
## Introduction
EdgeRunner-Tactical-7B is a powerful and efficient language model for the edge. Our mission is to build generative AI for the edge that is safe, secure, and transparent. To that end, the EdgeRunner team is proud to release EdgeRunner-Tactical-7B, the most powerful language model for its size to date.

EdgeRunner-Tactical-7B is a 7-billion-parameter language model that delivers strong performance while demonstrating that state-of-the-art (SOTA) models can run at the edge.
## Highlights
- 7 billion parameters that balance power and efficiency
- SOTA performance within the 7B model range
- Initialized from Qwen2-Instruct, leveraging prior advancements
- Self-Play Preference Optimization (SPPO) applied for continuous training and alignment
- Performance competitive with Meta's Llama-3-70B, Mixtral 8x7B, and Yi-34B on several benchmarks
- Context length of 128K tokens, ideal for extensive conversations and large-scale text tasks
## Quickstart
Below is a code snippet showing how to load the tokenizer and model, and how to generate content.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model inputs onto

model = AutoModelForCausalLM.from_pretrained(
    "edgerunner-ai/EdgeRunner-Tactical-7B",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("edgerunner-ai/EdgeRunner-Tactical-7B")

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512
)
# Strip the prompt tokens so only the newly generated tokens are decoded
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
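For reference, the `apply_chat_template` call above expands the message list into a single prompt string using the template bundled with the tokenizer. A minimal sketch of what a ChatML-style template (the format used by Qwen2-derived models) produces — illustrative only; the tokenizer's own template is authoritative:

```python
def chatml_prompt(messages, add_generation_prompt=True):
    """Render a list of {role, content} dicts in ChatML format."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Open the assistant turn so generation continues from here
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give me a short introduction to large language models."},
]
text = chatml_prompt(messages)
```

The trailing open assistant turn is what `add_generation_prompt=True` contributes: it tells the model that the next tokens belong to the assistant's reply.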
## Example Outputs
- Create a Quantum Future:
- Ask for a structured JSON output:
## Evaluation
In this section, we report results for EdgeRunner-Tactical-7B on standard automatic benchmarks.
### Arena-Hard Benchmark

| Model | Score | CI | Avg Tokens |
| --- | --- | --- | --- |
| gpt-4-turbo-2024-04-09 | 82.63 | (-1.71, +1.57) | 662.0 |
| claude-3-5-sonnet-20240620 | 79.35 | (-1.45, +2.06) | 567.0 |
| gpt-4o-2024-05-13 | 79.21 | (-1.50, +1.66) | 696.0 |
| gpt-4-0125-preview | 77.96 | (-2.12, +1.63) | 619.0 |
| gpt-4o-mini | 74.94 | (-2.40, +1.75) | 668.0 |
| gemini-1.5-pro-api-0514 | 71.96 | (-2.39, +2.10) | 676.0 |
| yi-large-preview | 71.48 | (-2.03, +3.14) | 720.0 |
| glm-4-0520 | 63.84 | (-2.72, +1.81) | 636.0 |
| yi-large | 63.7 | (-2.72, +2.21) | 626.0 |
| deepseek-coder-v2 | 62.3 | (-1.73, +2.41) | 578.0 |
| claude-3-opus-20240229 | 60.36 | (-2.84, +2.75) | 541.0 |
| gemma-2-27b-it | 57.51 | (-2.35, +2.46) | 577.0 |
| glm-4-0116 | 55.72 | (-2.51, +2.31) | 622.0 |
| gemini-1.5-pro-api-0409-preview | 53.37 | (-2.53, +1.89) | 478.0 |
| glm-4-air | 50.88 | (-2.60, +2.45) | 619.0 |
| gpt-4-0314 | 50.0 | (-0.00, +0.00) | 423.0 |
| gemini-1.5-flash-api-0514 | 49.61 | (-2.93, +2.85) | 642.0 |
| qwen2-72b-instruct | 46.86 | (-2.51, +2.22) | 515.0 |
| claude-3-sonnet-20240229 | 46.8 | (-2.94, +2.35) | 552.0 |
| llama-3-70b-instruct | 46.57 | (-2.00, +2.66) | 591.0 |
| claude-3-haiku-20240307 | 41.47 | (-2.15, +2.65) | 505.0 |
| gpt-4-0613 | 37.9 | (-2.21, +2.51) | 354.0 |
| mistral-large-2402 | 37.71 | (-1.88, +2.77) | 400.0 |
| EdgeRunner-Tactical-7B | 37.47 | (-2.74, +2.57) | 721.0 |
| mixtral-8x22b-instruct-v0.1 | 36.36 | (-2.61, +2.60) | 430.0 |
| qwen1.5-72b-chat | 36.12 | (-2.81, +2.39) | 474.0 |
| phi-3-medium-4k-instruct | 33.37 | (-2.02, +2.25) | 517.0 |
| mistral-medium | 31.9 | (-2.54, +2.13) | 485.0 |
| phi-3-small-8k-instruct | 29.77 | (-2.16, +2.02) | 568.0 |
| mistral-next | 27.37 | (-1.90, +1.99) | 297.0 |
| qwen2-7b-instruct | 25.2 | (-1.55, +2.46) | 618.0 |
| gpt-3.5-turbo-0613 | 24.82 | (-2.15, +1.90) | 401.0 |
| claude-2.0 | 23.99 | (-1.90, +1.75) | 295.0 |
| Arcee-Spark | 23.52 | (-2.03, +1.73) | 622.0 |
| mixtral-8x7b-instruct-v0.1 | 23.4 | (-1.87, +1.73) | 457.0 |
| gpt-3.5-turbo-0125 | 23.34 | (-1.46, +2.31) | 329.0 |
| yi-34b-chat | 23.15 | (-2.15, +1.85) | 611.0 |
| starling-lm-7b-beta | 23.01 | (-1.98, +1.71) | 530.0 |
| claude-2.1 | 22.77 | (-1.48, +2.38) | 290.0 |
| llama-3-8b-instruct | 20.56 | (-1.65, +2.09) | 585.0 |
| gpt-3.5-turbo-1106 | 18.87 | (-1.79, +2.34) | 285.0 |
| gpt-3.5-turbo-0314 | 18.05 | (-1.47, +2.09) | 334.0 |
| gemini-pro | 17.8 | (-1.65, +1.54) | 322.0 |
| phi-3-mini-128k-instruct | 15.43 | (-1.71, +1.60) | 609.0 |
| mistral-7b-instruct | 12.57 | (-1.58, +1.54) | 541.0 |
| gemma-1.1-7b-it | 12.09 | (-1.35, +1.56) | 341.0 |
| llama-2-70b-chat | 11.55 | (-1.18, +1.27) | 595.0 |
| vicuna-33b | 8.63 | (-0.88, +1.28) | 451.0 |
| gemma-7b-it | 7.47 | (-1.05, +1.09) | 378.0 |
| gemma-1.1-2b-it | 3.37 | (-0.67, +0.70) | 316.0 |
| gemma-2b-it | 3.0 | (-0.68, +0.62) | 369.0 |
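The CI column reports bootstrapped confidence intervals around each score. A toy illustration of a percentile bootstrap over simulated per-battle outcomes (assumed setup; Arena-Hard's actual pipeline bootstraps over the full set of judge decisions):

```python
import random

def bootstrap_ci(outcomes, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for the mean of 0/1 battle outcomes."""
    rng = random.Random(seed)
    n = len(outcomes)
    means = sorted(
        sum(rng.choices(outcomes, k=n)) / n  # resample with replacement
        for _ in range(n_resamples)
    )
    lo = means[int(alpha / 2 * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# 500 simulated battles with a ~37% win rate, roughly in the range of
# EdgeRunner-Tactical-7B's Arena-Hard score (numbers are illustrative)
outcomes = [1] * 187 + [0] * 313
lo, hi = bootstrap_ci(outcomes)
```

The asymmetric intervals in the table arise naturally from this percentile method: the resampled distribution need not be symmetric around the point estimate.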
### InfiniteBench

| Task Name | GPT-4 | YaRN-Mistral-7B | Kimi-Chat | Claude 2 | Yi-6B-200K | Yi-34B-200K | Chatglm3-6B-128K | EdgeRunner-Tactical-7B | Qwen2-7B-Instruct |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Retrieve.PassKey | 100% | 92.71% | 98.14% | 97.80% | 100.00% | 100.00% | 92.20% | 100% | 100% |
| Retrieve.Number | 100% | 56.61% | 95.42% | 98.14% | 94.92% | 100.00% | 80.68% | 100% | 99.83% |
| Retrieve.KV | 89.00% | < 5% | 53.60% | 65.40% | < 5% | < 5% | < 5% | 2.2% | 1.8% |
| En.Sum | 14.73% | 9.09% | 17.96% | 14.50% | < 5% | < 5% | < 5% | 33.07% | 29.13% |
| En.QA | 22.44% | 9.55% | 16.52% | 11.97% | 9.20% | 12.17% | < 5% | 3.4% | 9.09% |
| En.MC | 67.25% | 27.95% | 72.49% | 62.88% | 36.68% | 38.43% | 10.48% | 66.81% | 66.37% |
| En.Dia | 8.50% | 7.50% | 11.50% | 46.50% | < 5% | < 5% | < 5% | 29% | 17% |
| Zh.QA | 25.96% | 16.98% | 17.93% | 9.64% | 15.07% | 13.61% | < 5% | 4.6% | 11.14% |
| Code.Debug | 37.06% | < 5% | 17.77% | < 5% | 9.14% | 13.96% | 7.36% | 22.08% | 24.61% |
| Code.Run | 23.25% | < 5% | < 5% | < 5% | < 5% | < 5% | < 5% | 0% | 0.5% |
| Math.Calc | < 5% | < 5% | < 5% | < 5% | < 5% | < 5% | < 5% | 0% | 0% |
| Math.Find | 60.00% | 17.14% | 12.57% | 32.29% | < 5% | 25.71% | 7.71% | 29.14% | 31.42% |
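The Retrieve.PassKey task hides a short key inside a very long distractor text and asks the model to repeat it back. A minimal sketch of how such a prompt can be constructed (illustrative; InfiniteBench's official prompts differ in wording and scale to reach 100K+ tokens):

```python
import random

def make_passkey_prompt(passkey: str, n_filler: int = 200, seed: int = 0) -> str:
    """Bury a passkey at a random depth inside repeated filler sentences."""
    rng = random.Random(seed)
    filler = "The grass is green. The sky is blue. The sun is yellow. "
    lines = [filler] * n_filler
    # Insert the needle at a random position in the haystack
    lines.insert(rng.randrange(n_filler), f"The pass key is {passkey}. Remember it. ")
    return (
        "There is important info hidden in a lot of irrelevant text. Find it.\n"
        + "".join(lines)
        + "\nWhat is the pass key?"
    )

prompt = make_passkey_prompt("71432")
```

Scoring is then a simple check that the generated answer contains the key, which is why retrieval tasks saturate at 100% for long-context models while harder aggregation tasks (Retrieve.KV, Math.Calc) do not.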
### GSM@ZeroEval

| Model | Acc | No answer | Reason Lens |
| --- | --- | --- | --- |
| Llama-3.1-405B-Instruct-Turbo | 95.91 | 0.08 | 365.07 |
| claude-3-5-sonnet-20240620 | 95.6 | 0 | 465.19 |
| claude-3-opus-20240229 | 95.6 | 0 | 410.62 |
| gpt-4o-2024-05-13 | 95.38 | 0 | 479.98 |
| gpt-4o-mini-2024-07-18 | 94.24 | 0 | 463.71 |
| deepseek-chat | 93.93 | 0 | 495.52 |
| deepseek-coder | 93.78 | 0 | 566.89 |
| gemini-1.5-pro | 93.4 | 0 | 389.17 |
| Meta-Llama-3-70B-Instruct | 93.03 | 0 | 352.05 |
| Qwen2-72B-Instruct | 92.65 | 0 | 375.96 |
| claude-3-sonnet-20240229 | 91.51 | 0 | 762.69 |
| gemini-1.5-flash | 91.36 | 0 | 344.61 |
| gemma-2-27b-it@together | 90.22 | 0 | 364.68 |
| claude-3-haiku-20240307 | 88.78 | 0 | 587.65 |
| gemma-2-9b-it | 87.41 | 0 | 394.83 |
| reka-core-20240501 | 87.41 | 0.08 | 414.7 |
| Athene-70B | 86.66 | 0.3 | 253.53 |
| Yi-1.5-34B-Chat | 84.08 | 0.08 | 553.47 |
| Llama-3.1-8B-Instruct | 82.87 | 0.45 | 414.19 |
| Mistral-Nemo-Instruct-2407 | 82.79 | 0 | 349.81 |
| yi-large-preview | 82.64 | 0 | 514.25 |
| EdgeRunner-Tactical-7B | 81.12 | 0.08 | 615.89 |
| gpt-3.5-turbo-0125 | 80.36 | 0 | 350.97 |
| command-r-plus | 80.14 | 0.08 | 294.08 |
| Qwen2-7B-Instruct | 80.06 | 0 | 452.6 |
| yi-large | 80.06 | 0 | 479.87 |
| Yi-1.5-9B-Chat | 76.42 | 0.08 | 485.39 |
| Phi-3-mini-4k-instruct | 75.51 | 0 | 462.53 |
| reka-flash-20240226 | 74.68 | 0.45 | 460.06 |
| Mixtral-8x7B-Instruct-v0.1 | 70.13 | 2.27 | 361.12 |
| command-r | 52.99 | 0 | 294.43 |
| Qwen2-1.5B-Instruct | 43.37 | 4.78 | 301.67 |
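The "No answer" column counts responses from which no final answer could be parsed. A simple sketch of the kind of answer extraction such a harness applies to GSM-style reasoning traces (illustrative; ZeroEval's actual parser expects a structured reply format):

```python
import re

def extract_final_number(response: str):
    """Return the last number in a model response, or None (a "No answer")."""
    matches = re.findall(r"-?\d[\d,]*(?:\.\d+)?", response.replace("$", ""))
    if not matches:
        return None  # counted toward the "No answer" rate
    return float(matches[-1].replace(",", ""))

answer = extract_final_number("3 boxes of 12 eggs is 36 eggs. The answer is 36.")
missed = extract_final_number("I am not sure how to solve this.")
```

Accuracy is then exact match between the extracted number and the gold answer; unparseable responses hurt both columns, which is one reason low "No answer" rates matter alongside Acc.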
### MMLU-REDUX@ZeroEval

| Model | Acc | No answer | Reason Lens |
| --- | --- | --- | --- |
| gpt-4o-2024-05-13 | 88.01 | 0.14 | 629.79 |
| claude-3-5-sonnet-20240620 | 86 | 0.18 | 907.1 |
| Llama-3.1-405B-Instruct-Turbo | 85.64 | 0.76 | 449.71 |
| gpt-4-turbo-2024-04-09 | 85.31 | 0.04 | 631.38 |
| gemini-1.5-pro | 82.76 | 1.94 | 666.7 |
| claude-3-opus-20240229 | 82.54 | 0.58 | 500.35 |
| yi-large-preview | 82.15 | 0.14 | 982.6 |
| gpt-4-0314 | 81.64 | 0.04 | 397.22 |
| Qwen2-72B-Instruct | 81.61 | 0.29 | 486.41 |
| gpt-4o-mini-2024-07-18 | 81.5 | 0.07 | 526 |
| yi-large | 81.17 | 0 | 774.85 |
| deepseek-chat | 80.81 | 0.11 | 691.91 |
| deepseek-coder | 79.63 | 0.14 | 704.72 |
| Meta-Llama-3-70B-Instruct | 78.01 | 0.11 | 520.77 |
| gemini-1.5-flash | 77.36 | 1.26 | 583.45 |
| Athene-70B | 76.64 | 0.04 | 552.61 |
| reka-core-20240501 | 76.42 | 0.76 | 701.67 |
| gemma-2-27b-it@together | 75.67 | 0.61 | 446.51 |
| claude-3-sonnet-20240229 | 74.87 | 0.07 | 671.75 |
| gemma-2-9b-it@nvidia | 72.82 | 0.76 | 499 |
| Yi-1.5-34B-Chat | 72.79 | 1.01 | 620.1 |
| claude-3-haiku-20240307 | 72.32 | 0.04 | 644.59 |
| Phi-3-mini-4k-instruct | 70.34 | 0.43 | 677.09 |
| command-r-plus | 68.61 | 0 | 401.51 |
| gpt-3.5-turbo-0125 | 68.36 | 0.04 | 357.92 |
| EdgeRunner-Tactical-7B | 67.71 | 0.65 | 917.6 |
| Llama-3.1-8B-Instruct | 67.13 | 3.38 | 399.54 |
| Qwen2-7B-Instruct | 66.92 | 0.72 | 533.15 |
| Mistral-Nemo-Instruct-2407 | 66.88 | 0.47 | 464.19 |
| Yi-1.5-9B-Chat | 65.05 | 4.61 | 542.87 |
| reka-flash-20240226 | 64.72 | 0.32 | 659.25 |
| Mixtral-8x7B-Instruct-v0.1 | 63.17 | 5.51 | 324.31 |
| Meta-Llama-3-8B-Instruct | 61.66 | 0.97 | 600.81 |
| command-r | 61.12 | 0.04 | 382.23 |
| Qwen2-1.5B-Instruct | 41.11 | 7.74 | 280.56 |
### WildBench

| Model | WB_Elo | RewardScore_Avg | task_macro_reward.K=-1 | Length |
| --- | --- | --- | --- | --- |
| gpt-4o-2024-05-13 | 1248.12 | 50.05 | 40.80 | 3723.52 |
| claude-3-5-sonnet-20240620 | 1229.76 | 46.16 | 37.63 | 2911.85 |
| gpt-4-turbo-2024-04-09 | 1225.29 | 46.19 | 37.17 | 3093.17 |
| gpt-4-0125-preview | 1211.44 | 41.24 | 30.20 | 3335.64 |
| gemini-1.5-pro | 1209.23 | 45.27 | 37.59 | 3247.97 |
| yi-large-preview | 1209.00 | 46.92 | 38.54 | 3512.68 |
| claude-3-opus-20240229 | 1206.56 | 37.03 | 22.35 | 2685.98 |
| Meta-Llama-3-70B-Instruct | 1197.72 | 35.15 | 22.54 | 3046.64 |
| Athene-70B | 1197.41 | 29.77 | 0.00 | 3175.14 |
| deepseek-coder-v2 | 1194.11 | 29.39 | 11.38 | 2795.31 |
| gpt-4o-mini-2024-07-18 | 1192.43 | 28.57 | 0.00 | 3648.13 |
| yi-large | 1191.88 | 33.35 | 17.77 | 3095.34 |
| gemini-1.5-flash | 1190.30 | 37.45 | 26.04 | 3654.40 |
| deepseek-v2-chat-0628 | 1188.07 | 27.00 | 0.00 | 3252.38 |
| gemma-2-9b-it-SimPO | 1184.67 | 26.64 | 0.00 | 4277.67 |
| gemma-2-9b-it-DPO | 1182.43 | 26.61 | 0.00 | 3982.63 |
| nemotron-4-340b-instruct | 1181.77 | 33.76 | 19.85 | 2754.01 |
| claude-3-sonnet-20240229 | 1179.81 | 28.09 | 10.70 | 2670.24 |
| deepseekv2-chat | 1178.76 | 30.41 | 12.60 | 2896.97 |
| gemma-2-27b-it@together | 1178.34 | 24.27 | 0.00 | 2924.55 |
| Qwen2-72B-Instruct | 1176.75 | 24.77 | 5.03 | 2856.45 |
| reka-core-20240501 | 1173.85 | 31.48 | 17.06 | 2592.59 |
| Mistral-Nemo-Instruct-2407 | 1165.29 | 22.19 | 0.00 | 3318.21 |
| Yi-1.5-34B-Chat | 1163.69 | 30.83 | 16.06 | 3523.56 |
| EdgeRunner-Tactical-7B | 1162.88 | 22.26 | 0.00 | 3754.66 |
| claude-3-haiku-20240307 | 1160.56 | 16.30 | -6.30 | 2601.03 |
| mistral-large-2402 | 1159.72 | 13.27 | -12.36 | 2514.98 |
| deepseek-v2-coder-0628 | 1155.97 | 22.83 | 0.00 | 2580.18 |
| gemma-2-9b-it | 1154.30 | 21.35 | 0.00 | 2802.89 |
| command-r-plus | 1153.15 | 16.58 | -3.60 | 3293.81 |
| glm-4-9b-chat | 1152.68 | 20.71 | 2.33 | 3692.04 |
| Qwen1.5-72B-Chat-greedy | 1151.97 | 20.83 | 1.72 | 2392.36 |
| Yi-1.5-9B-Chat | 1151.43 | 21.80 | 4.93 | 3468.23 |
| Meta-Llama-3-8B-Instruct | 1140.76 | 6.72 | -15.76 | 2975.19 |
| Qwen2-7B-Instruct | 1137.66 | 16.20 | 0.00 | 3216.43 |
| Starling-LM-7B-beta-ExPO | 1137.58 | 11.28 | -9.01 | 2835.83 |
| Hermes-2-Theta-Llama-3-8B | 1135.99 | 3.18 | -23.28 | 2742.17 |
| Llama-3.1-8B-Instruct | 1135.42 | 16.38 | 0.00 | 3750.60 |
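WB_Elo ranks models by Elo ratings estimated from pairwise judgments. The core update rule behind an Elo rating, sketched for a single battle between two models (illustrative; WildBench fits ratings over all battles jointly rather than sequentially, and the K-factor here is an assumed value):

```python
def elo_update(r_a: float, r_b: float, score_a: float, k: float = 4.0):
    """One sequential Elo update; score_a is 1 for a win, 0.5 tie, 0 loss."""
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
    r_a_new = r_a + k * (score_a - expected_a)
    r_b_new = r_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return r_a_new, r_b_new

# A lower-rated model upsetting a higher-rated one gains rating;
# the ratings shown are EdgeRunner-Tactical-7B's and gpt-4o's from the table
r_small, r_big = elo_update(1162.88, 1248.12, score_a=1.0)
```

Because the update is zero-sum, an upset moves both ratings by the same amount in opposite directions, and the size of the move grows with the rating gap.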
### AlpacaEval 2.0

| Model | Length-Controlled Win Rate | Win Rate | N Total | Avg Length |
| --- | --- | --- | --- | --- |
| gpt-4o-2024-05-13 | 57.46 | 51.33 | 805 | 1873 |
| gpt-4-turbo-2024-04-09 | 55.02 | 46.12 | 805 | 1802 |
| claude-3-5-sonnet-20240620 | 52.37 | 40.56 | 805 | 1488 |
| yi-large-preview | 51.89 | 57.47 | 805 | 2335 |
| gpt4_1106_preview | 50.0 | 50.0 | 805 | 2049 |
| Qwen1.5-110B-Chat | 43.91 | 33.78 | 805 | 1631 |
| claude-3-opus-20240229 | 40.51 | 29.11 | 805 | 1388 |
| gpt4 | 38.13 | 23.58 | 805 | 1365 |
| Qwen1.5-72B-Chat | 36.57 | 26.5 | 805 | 1549 |
| gpt4_0314 | 35.31 | 22.07 | 805 | 1371 |
| Meta-Llama-3-70B-Instruct | 34.42 | 33.18 | 805 | 1919 |
| EdgeRunner-Tactical-7B | 34.41 | 51.28 | 805 | 2735 |
| mistral-large-2402 | 32.65 | 21.44 | 805 | 1362 |
| Mixtral-8x22B-Instruct-v0.1 | 30.88 | 22.21 | 805 | 1445 |
| gpt4_0613 | 30.18 | 15.76 | 805 | 1140 |
| mistral-medium | 28.61 | 21.86 | 805 | 1500 |
| claude-2 | 28.16 | 17.19 | 805 | 1069 |
| Samba-CoE-v0.2 | 27.62 | 21.85 | 805 | 1469 |
| internlm2-chat-20b-ExPO | 27.23 | 46.19 | 805 | 3335 |
| Yi-34B-Chat | 27.19 | 29.66 | 805 | 2123 |
| Starling-LM-7B-beta-ExPO | 26.41 | 29.6 | 805 | 2215 |
| Llama-3.1-8B-Instruct | 26.41 | 30.32 | 805 | 2171 |
| Snorkel-Mistral-PairRM-DPO | 26.39 | 30.22 | 804 | 2736 |
| Arcee-Spark | 25.58 | 26.19 | 805 | 2002 |
| claude-2.1 | 25.25 | 15.73 | 805 | 1096 |
| gemini-pro | 24.38 | 18.18 | 805 | 1456 |
| Qwen1.5-14B-Chat | 23.9 | 18.65 | 805 | 1607 |
| Mixtral-8x7B-Instruct-v0.1 | 23.69 | 18.26 | 805 | 1465 |
| Meta-Llama-3-8B-Instruct | 22.92 | 22.57 | 805 | 1899 |
| Samba-CoE-v0.1 | 22.87 | 16.84 | 805 | 1316 |
| gpt-3.5-turbo-0613 | 22.35 | 14.1 | 805 | 1331 |
| Qwen2-7B-Instruct | 21.51 | 18.93 | 805 | 1793 |
| gpt-3.5-turbo-1106 | 19.3 | 9.18 | 805 | 796 |
| internlm2-chat-20b-ppo | 18.75 | 21.75 | 805 | 2373 |
| claude-2.1_concise | 18.21 | 9.23 | 805 | 573 |
| gpt-3.5-turbo-0301 | 18.09 | 9.62 | 805 | 827 |
| deepseek-llm-67b-chat | 17.84 | 12.09 | 805 | 1151 |
| vicuna-33b-v1.3 | 17.57 | 12.71 | 805 | 1479 |
| Mistral-7B-Instruct-v0.2 | 17.11 | 14.72 | 805 | 1676 |
| OpenHermes-2.5-Mistral-7B | 16.25 | 10.34 | 805 | 1107 |
| Qwen1.5-7B-Chat | 14.75 | 11.77 | 805 | 1594 |
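The gap between EdgeRunner-Tactical-7B's raw win rate (51.28) and its length-controlled win rate (34.41) reflects AlpacaEval 2.0's length correction: judge models tend to prefer longer answers, so the LC metric models the win probability with a response-length term and reports the win rate predicted when that length advantage is removed. A toy sketch of the idea (illustrative only; the coefficients below are assumed for the example, and the official implementation fits a generalized linear model per instruction):

```python
import math

def predicted_winrate(skill: float, length_gap: float, length_coef: float) -> float:
    """Logistic win probability from a skill term plus a length-bias term."""
    return 1.0 / (1.0 + math.exp(-(skill + length_coef * length_gap)))

# Assumed numbers for illustration: with a positive length bias, longer
# answers inflate the raw win rate above the length-controlled one
raw = predicted_winrate(skill=-0.64, length_gap=1.0, length_coef=0.69)  # longer than baseline
lc = predicted_winrate(skill=-0.64, length_gap=0.0, length_coef=0.69)   # length-controlled
```

Setting the length gap to zero isolates the skill term, which is why verbose models (EdgeRunner-Tactical-7B averages 2735 characters here, versus ~1400 for the Claude models) see their LC win rate land well below their raw win rate.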