vllm (pretrained=/root/autodl-tmp/DistilQwen2.5-DS3-0324-32B,add_bos_token=true,max_model_len=3096,dtype=bfloat16,tensor_parallel_size=4), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto

| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| gsm8k | 3 | flexible-extract | 5 | exact_match | ↑ | 0.564 | ± | 0.0314 |
| | | strict-match | 5 | exact_match | ↑ | 0.376 | ± | 0.0307 |

vllm (pretrained=/root/autodl-tmp/DistilQwen2.5-DS3-0324-32B,add_bos_token=true,max_model_len=5096,dtype=bfloat16,tensor_parallel_size=4), gen_kwargs: (None), limit: 500.0, num_fewshot: 5, batch_size: auto

| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| gsm8k | 3 | flexible-extract | 5 | exact_match | ↑ | 0.570 | ± | 0.0222 |
| | | strict-match | 5 | exact_match | ↑ | 0.412 | ± | 0.0220 |

vllm (pretrained=/root/autodl-tmp/DistilQwen2.5-DS3-0324-32B,add_bos_token=true,max_model_len=3048,dtype=bfloat16,tensor_parallel_size=4), gen_kwargs: (None), limit: 5.0, num_fewshot: None, batch_size: 1

| Groups | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| mmlu | 2 | none | | acc | ↑ | 0.8421 | ± | 0.0200 |
| - humanities | 2 | none | | acc | ↑ | 0.8308 | ± | 0.0421 |
| - other | 2 | none | | acc | ↑ | 0.8308 | ± | 0.0435 |
| - social sciences | 2 | none | | acc | ↑ | 0.8833 | ± | 0.0333 |
| - stem | 2 | none | | acc | ↑ | 0.8316 | ± | 0.0380 |

vllm (pretrained=/root/autodl-tmp/DistilQwen2.5-DS3-0324-32B-awq,add_bos_token=true,max_model_len=3096,dtype=bfloat16,tensor_parallel_size=4), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto

| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| gsm8k | 3 | flexible-extract | 5 | exact_match | ↑ | 0.508 | ± | 0.0317 |
| | | strict-match | 5 | exact_match | ↑ | 0.380 | ± | 0.0308 |

vllm (pretrained=/root/autodl-tmp/DistilQwen2.5-DS3-0324-32B-awq,add_bos_token=true,max_model_len=5096,dtype=bfloat16,tensor_parallel_size=4), gen_kwargs: (None), limit: 500.0, num_fewshot: 5, batch_size: auto

| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| gsm8k | 3 | flexible-extract | 5 | exact_match | ↑ | 0.542 | ± | 0.0223 |
| | | strict-match | 5 | exact_match | ↑ | 0.394 | ± | 0.0219 |

vllm (pretrained=/root/autodl-tmp/DistilQwen2.5-DS3-0324-32B-awq,add_bos_token=true,max_model_len=3048,dtype=bfloat16,tensor_parallel_size=4), gen_kwargs: (None), limit: 5.0, num_fewshot: None, batch_size: 1

| Groups | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| mmlu | 2 | none | | acc | ↑ | 0.8526 | ± | 0.0192 |
| - humanities | 2 | none | | acc | ↑ | 0.8308 | ± | 0.0421 |
| - other | 2 | none | | acc | ↑ | 0.8308 | ± | 0.0449 |
| - social sciences | 2 | none | | acc | ↑ | 0.8833 | ± | 0.0333 |
| - stem | 2 | none | | acc | ↑ | 0.8632 | ± | 0.0333 |

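
The base and AWQ runs above can be compared side by side with a little arithmetic. The following is a minimal sketch, not part of the original evaluation: it copies the reported values and standard errors, and it assumes the two runs can be treated as independent samples (they actually share the same questions, so a paired comparison would be more precise). The `RESULTS` dictionary and the `compare` helper are hypothetical names introduced for illustration.

```python
import math

# (value, stderr) pairs copied from the tables above, ordered as (base, AWQ).
RESULTS = {
    "gsm8k flexible-extract (limit 250)": ((0.564, 0.0314), (0.508, 0.0317)),
    "gsm8k strict-match (limit 250)": ((0.376, 0.0307), (0.380, 0.0308)),
    "gsm8k flexible-extract (limit 500)": ((0.570, 0.0222), (0.542, 0.0223)),
    "gsm8k strict-match (limit 500)": ((0.412, 0.0220), (0.394, 0.0219)),
    "mmlu (limit 5)": ((0.8421, 0.0200), (0.8526, 0.0192)),
}


def compare(base, quant):
    """Return (difference, z-score) for AWQ minus base.

    Assumption: the two runs are treated as independent, so the standard
    error of the difference is the root sum of squares of both stderrs.
    """
    (base_value, base_se), (quant_value, quant_se) = base, quant
    diff = quant_value - base_value
    z = diff / math.hypot(base_se, quant_se)
    return diff, z


for name, (base, quant) in RESULTS.items():
    diff, z = compare(base, quant)
    print(f"{name}: diff={diff:+.4f}, z={z:+.2f}")
```

Under that independence assumption, every gap between the base and AWQ scores stays below the usual 1.96 threshold, so these limited-sample runs do not by themselves show a clear difference between the two checkpoints.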
Model tree for noneUsername/DistilQwen2.5-DS3-0324-32B-awq

Base model: alibaba-pai/DistilQwen2.5-DS3-0324-32B