【Evaluation】Best practice for evaluating Qwen3 !!
#2, opened by wangxingjun778
For more details, please refer to: https://evalscope.readthedocs.io/en/latest/best_practice/qwen3.html
Powered by: EvalScope https://github.com/modelscope/evalscope
- Speed Benchmark
- Benchmark collection (for evaluating abilities such as code, understanding, instruction following, math, ...)
NOTE: The results are based on a sample of each original benchmark, taken with the eval argument --limit
- Thinking efficiency of Qwen3
- Run the Gradio visualization: evalscope app
Get started and have fun! :)
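To try the --limit sampling mentioned in the note above, a command line can be assembled along these lines. This is only a sketch: the helper `build_eval_command` is hypothetical, and the model ID, the gsm8k dataset, and the limit of 10 are illustrative choices, not the settings behind the reported results. The `--model` and `--datasets` flags are assumed to match the EvalScope quickstart, so check the linked guide for the exact arguments. The snippet only prints the command; it does not run an evaluation.

```python
import shlex


def build_eval_command(model, datasets, limit):
    """Assemble an `evalscope eval` command line without executing it.

    Hypothetical helper for illustration; the flag names are assumed
    to match the EvalScope CLI described in the linked guide.
    """
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return [
        "evalscope", "eval",
        "--model", model,
        "--datasets", *datasets,
        "--limit", str(limit),  # evaluate only a sample of each benchmark
    ]


# Illustrative values: model ID, dataset, and limit are assumptions.
cmd = build_eval_command("Qwen/Qwen3-1.7B", ["gsm8k"], limit=10)
print(shlex.join(cmd))
```

A small limit keeps a first run quick; scores from such a sample are not directly comparable to full-benchmark numbers.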
Do you have resources to test this finetuned model? Ty
https://huggingface.co/wzx111/Qwen3-1.7B-MATH-GDPO