Evaluation Results
Metric | Value |
---|---|
Avg. | 29.18 |
IFEval (0-Shot) | 64.8 |
BBH (3-Shot) | 35.48 |
MATH Level 5 (4-Shot) | 15.86 |
GPQA (0-Shot) | 10.29 |
MuSR (0-Shot) | 13.47 |
MMLU-PRO (5-Shot) | 35.17 |
Detailed results can be found here. For personal benchmarks, see PERSONAL_BENCHMARK.md.
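As a quick sanity check, the "Avg." row can be reproduced from the six benchmark scores in the table. The sketch below assumes the average is a plain unweighted mean of those scores; the variable names are illustrative and not part of any leaderboard tooling.

```python
# Scores copied from the table above; names are illustrative only.
scores = {
    "IFEval (0-Shot)": 64.8,
    "BBH (3-Shot)": 35.48,
    "MATH Level 5 (4-Shot)": 15.86,
    "GPQA (0-Shot)": 10.29,
    "MuSR (0-Shot)": 13.47,
    "MMLU-PRO (5-Shot)": 35.17,
}

# Assumption: "Avg." is the unweighted arithmetic mean of the six scores.
average = sum(scores.values()) / len(scores)

print(f"Avg. = {average:.2f}")  # prints "Avg. = 29.18"
```

Under that assumption the computed mean agrees with the reported 29.18 to two decimal places.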
Model tree for sethuiyer/Qwen2.5-7B-Anvita
- Base model: Qwen/Qwen2.5-7B
- Finetuned: Qwen/Qwen2.5-7B-Instruct
Evaluation results by metric (Open LLM Leaderboard)
- IFEval (0-Shot), strict accuracy: 64.330
- BBH (3-Shot), normalized accuracy: 35.480
- MATH Lvl 5 (4-Shot), exact match: 15.860
- GPQA (0-shot), acc_norm: 10.290
- MuSR (0-shot), acc_norm: 13.470
- MMLU-PRO (5-shot), accuracy on test set: 35.170