vllm (pretrained=/root/autodl-tmp/DistilQwen2.5-DS3-0324-32B,add_bos_token=true,max_model_len=3096,dtype=bfloat16,tensor_parallel_size=4), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value|   |Stderr|
|-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.564|±  |0.0314|
|     |       |strict-match    |     5|exact_match|↑  |0.376|±  |0.0307|

vllm (pretrained=/root/autodl-tmp/DistilQwen2.5-DS3-0324-32B,add_bos_token=true,max_model_len=5096,dtype=bfloat16,tensor_parallel_size=4), gen_kwargs: (None), limit: 500.0, num_fewshot: 5, batch_size: auto

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value|   |Stderr|
|-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.570|±  |0.0222|
|     |       |strict-match    |     5|exact_match|↑  |0.412|±  |0.0220|
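
The Stderr values in the two GSM8K runs above are consistent with the standard error of the mean of `limit` binary (0/1) scores, computed with the sample variance (n - 1 in the denominator). The following is a minimal sketch under that assumption; `binary_stderr` is a hypothetical helper written for this note, not an lm-eval function.

```python
import math


def binary_stderr(acc: float, n: int) -> float:
    """Standard error of the mean of n 0/1 scores whose mean is acc.

    Assumes the sample variance (n - 1 denominator) is used.
    """
    return math.sqrt(acc * (1.0 - acc) / (n - 1))


# (accuracy, limit) pairs copied from the base-model GSM8K runs above.
runs = {
    "flexible-extract, limit 250": (0.564, 250),
    "strict-match, limit 250": (0.376, 250),
    "flexible-extract, limit 500": (0.570, 500),
    "strict-match, limit 500": (0.412, 500),
}

for name, (acc, n) in runs.items():
    print(f"{name}: stderr = {binary_stderr(acc, n):.4f}")
```

Doubling `limit` from 250 to 500 shrinks the standard error by roughly a factor of √2, which is why the second run reports tighter intervals.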

vllm (pretrained=/root/autodl-tmp/DistilQwen2.5-DS3-0324-32B,add_bos_token=true,max_model_len=3048,dtype=bfloat16,tensor_parallel_size=4), gen_kwargs: (None), limit: 5.0, num_fewshot: None, batch_size: 1

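|      Groups      |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|------------------|------:|------|------|------|---|-----:|---|-----:|
|mmlu              |      2|none  |      |acc   |↑  |0.8421|±  |0.0200|
| - humanities     |      2|none  |      |acc   |↑  |0.8308|±  |0.0421|
| - other          |      2|none  |      |acc   |↑  |0.8308|±  |0.0435|
| - social sciences|      2|none  |      |acc   |↑  |0.8833|±  |0.0333|
| - stem           |      2|none  |      |acc   |↑  |0.8316|±  |0.0380|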

vllm (pretrained=/root/autodl-tmp/DistilQwen2.5-DS3-0324-32B-awq,add_bos_token=true,max_model_len=3096,dtype=bfloat16,tensor_parallel_size=4), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value|   |Stderr|
|-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.508|±  |0.0317|
|     |       |strict-match    |     5|exact_match|↑  |0.380|±  |0.0308|

vllm (pretrained=/root/autodl-tmp/DistilQwen2.5-DS3-0324-32B-awq,add_bos_token=true,max_model_len=5096,dtype=bfloat16,tensor_parallel_size=4), gen_kwargs: (None), limit: 500.0, num_fewshot: 5, batch_size: auto

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value|   |Stderr|
|-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.542|±  |0.0223|
|     |       |strict-match    |     5|exact_match|↑  |0.394|±  |0.0219|

vllm (pretrained=/root/autodl-tmp/DistilQwen2.5-DS3-0324-32B-awq,add_bos_token=true,max_model_len=3048,dtype=bfloat16,tensor_parallel_size=4), gen_kwargs: (None), limit: 5.0, num_fewshot: None, batch_size: 1

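|      Groups      |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|------------------|------:|------|------|------|---|-----:|---|-----:|
|mmlu              |      2|none  |      |acc   |↑  |0.8526|±  |0.0192|
| - humanities     |      2|none  |      |acc   |↑  |0.8308|±  |0.0421|
| - other          |      2|none  |      |acc   |↑  |0.8308|±  |0.0449|
| - social sciences|      2|none  |      |acc   |↑  |0.8833|±  |0.0333|
| - stem           |      2|none  |      |acc   |↑  |0.8632|±  |0.0333|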
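
To judge whether the AWQ results differ meaningfully from the base model, the reported values and standard errors can be combined into a rough z-score. This is only a sketch: it treats the two runs as independent samples, which is a simplifying assumption because both models were scored on the same questions (a paired comparison would be tighter). `z_score` is a hypothetical helper, and 1.96 is the usual two-sided 95% threshold.

```python
import math


def z_score(base: float, se_base: float, quant: float, se_quant: float) -> float:
    """Difference (base - quantized) in units of the combined standard error.

    Assumes the two estimates are independent (a simplification here).
    """
    return (base - quant) / math.sqrt(se_base**2 + se_quant**2)


# (base value, base stderr, AWQ value, AWQ stderr) copied from the results above.
pairs = {
    "gsm8k flexible-extract (limit 500)": (0.570, 0.0222, 0.542, 0.0223),
    "gsm8k strict-match (limit 500)": (0.412, 0.0220, 0.394, 0.0219),
    "mmlu acc": (0.8421, 0.0200, 0.8526, 0.0192),
}

for name, values in pairs.items():
    z = z_score(*values)
    print(f"{name}: z = {z:+.2f}, |z| < 1.96: {abs(z) < 1.96}")
```

Under this independence assumption, every |z| in the sketch stays below 1.96, so these limited-sample runs do not separate the AWQ model from the base model on either benchmark.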