# Model Card for Mistral-Small-3.1-24B-Base-2503 (Text-Only)
This is the text-only variant of mistralai/Mistral-Small-3.1-24B-Base-2503. It also serves as the base model for mistralai/Devstral-Small-2505, which was released without an official base model.
## Features
- Text-only, no multimodality.
- 128k context length.
How was a text-only model achieved? The vision encoder was removed and the model architecture was converted from `mistral3` to `mistral`; the tokenizer was not modified. A sketch of the conversion follows below.
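Below is a minimal sketch of what such a conversion can look like with `transformers`. It is an illustration, not the author's exact script: the `language_model.` weight prefix and the `text_config` nesting match the `mistral3` checkpoint layout at the time of release and may differ in newer `transformers` versions.

```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    Mistral3ForConditionalGeneration,
)

SRC = "mistralai/Mistral-Small-3.1-24B-Base-2503"
DST = "./Mistral-Small-3.1-24B-Base-2503-Text-Only"

# Load the full multimodal model (mistral3 architecture).
mm = Mistral3ForConditionalGeneration.from_pretrained(SRC, torch_dtype=torch.bfloat16)

# The decoder config is nested under `text_config`; instantiating a causal LM
# from it yields a plain mistral-architecture model.
lm = AutoModelForCausalLM.from_config(mm.config.text_config, torch_dtype=torch.bfloat16)

# Keep only the language-model weights; everything under `vision_tower.` and
# `multi_modal_projector.` is dropped. NOTE: the "language_model." prefix is
# an assumption based on the original mistral3 layout.
prefix = "language_model."
state = {k[len(prefix):]: v for k, v in mm.state_dict().items() if k.startswith(prefix)}
lm.load_state_dict(state)

lm.save_pretrained(DST)
AutoTokenizer.from_pretrained(SRC).save_pretrained(DST)  # tokenizer unchanged
```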
## Reproduced eval
Serve the model with vLLM:

```sh
vllm serve casperhansen/Mistral-Small-3.1-24B-Base-2503-Text-Only
```
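Once the server is up, a quick smoke test against the OpenAI-compatible completions endpoint (a sketch assuming the default `http://localhost:8000` address and the `openai` Python client; the prompt is arbitrary):

```python
from openai import OpenAI

# vLLM exposes an OpenAI-compatible API; no real key is needed locally.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

out = client.completions.create(
    model="casperhansen/Mistral-Small-3.1-24B-Base-2503-Text-Only",
    prompt="The capital of France is",
    max_tokens=8,
    temperature=0.0,
)
print(out.choices[0].text)
```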
The reproduced results are summarized below (standard errors converted to percentage points).

| Model | MMLU (0-shot) |
|---|---|
| Small 3.1 24B Base (Text Only) | 77.25% ± 0.33% |
| Small 3.1 24B Base (Multimodal) | 77.34% ± 0.33% |
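The two scores differ by less than 0.1 percentage points, well within the reported standard errors, so the conversion appears lossless on this benchmark. A back-of-the-envelope check (my own addition, not part of the original eval):

```python
import math

# MMLU accuracies and standard errors from the tables below.
acc_mm, se_mm = 0.7734, 0.0033      # original multimodal model
acc_text, se_text = 0.7725, 0.0033  # text-only conversion

diff = acc_mm - acc_text
# Naive independent-runs approximation; both models actually answer the
# same questions, so the correlation between runs is ignored here.
se_diff = math.sqrt(se_mm**2 + se_text**2)
print(f"diff = {diff:.4f} ({diff / se_diff:.2f} standard errors)")  # ~0.19 SE
```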
### Original Multimodal: Full MMLU (Reproduced)
```sh
lm_eval --model local-completions \
  --model_args "base_url=http://localhost:8000/v1/completions,model=mistralai/Mistral-Small-3.1-24B-Base-2503" \
  --tasks mmlu \
  --batch_size 128
```
| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| mmlu | 2 | none | | acc | ↑ | 0.7734 | ± | 0.0033 |
| - humanities | 2 | none | | acc | ↑ | 0.6820 | ± | 0.0062 |
| - formal_logic | 1 | none | 0 | acc | ↑ | 0.5714 | ± | 0.0443 |
| - high_school_european_history | 1 | none | 0 | acc | ↑ | 0.8303 | ± | 0.0293 |
| - high_school_us_history | 1 | none | 0 | acc | ↑ | 0.9363 | ± | 0.0171 |
| - high_school_world_history | 1 | none | 0 | acc | ↑ | 0.9241 | ± | 0.0172 |
| - international_law | 1 | none | 0 | acc | ↑ | 0.9091 | ± | 0.0262 |
| - jurisprudence | 1 | none | 0 | acc | ↑ | 0.8148 | ± | 0.0376 |
| - logical_fallacies | 1 | none | 0 | acc | ↑ | 0.8589 | ± | 0.0274 |
| - moral_disputes | 1 | none | 0 | acc | ↑ | 0.8208 | ± | 0.0206 |
| - moral_scenarios | 1 | none | 0 | acc | ↑ | 0.3844 | ± | 0.0163 |
| - philosophy | 1 | none | 0 | acc | ↑ | 0.8296 | ± | 0.0214 |
| - prehistory | 1 | none | 0 | acc | ↑ | 0.8704 | ± | 0.0187 |
| - professional_law | 1 | none | 0 | acc | ↑ | 0.6095 | ± | 0.0125 |
| - world_religions | 1 | none | 0 | acc | ↑ | 0.8713 | ± | 0.0257 |
| - other | 2 | none | | acc | ↑ | 0.8317 | ± | 0.0064 |
| - business_ethics | 1 | none | 0 | acc | ↑ | 0.8200 | ± | 0.0386 |
| - clinical_knowledge | 1 | none | 0 | acc | ↑ | 0.8679 | ± | 0.0208 |
| - college_medicine | 1 | none | 0 | acc | ↑ | 0.7803 | ± | 0.0316 |
| - global_facts | 1 | none | 0 | acc | ↑ | 0.6600 | ± | 0.0476 |
| - human_aging | 1 | none | 0 | acc | ↑ | 0.7982 | ± | 0.0269 |
| - management | 1 | none | 0 | acc | ↑ | 0.9029 | ± | 0.0293 |
| - marketing | 1 | none | 0 | acc | ↑ | 0.9359 | ± | 0.0160 |
| - medical_genetics | 1 | none | 0 | acc | ↑ | 0.8900 | ± | 0.0314 |
| - miscellaneous | 1 | none | 0 | acc | ↑ | 0.9183 | ± | 0.0098 |
| - nutrition | 1 | none | 0 | acc | ↑ | 0.8791 | ± | 0.0187 |
| - professional_accounting | 1 | none | 0 | acc | ↑ | 0.6277 | ± | 0.0288 |
| - professional_medicine | 1 | none | 0 | acc | ↑ | 0.8603 | ± | 0.0211 |
| - virology | 1 | none | 0 | acc | ↑ | 0.5602 | ± | 0.0386 |
| - social sciences | 2 | none | | acc | ↑ | 0.8736 | ± | 0.0059 |
| - econometrics | 1 | none | 0 | acc | ↑ | 0.6491 | ± | 0.0449 |
| - high_school_geography | 1 | none | 0 | acc | ↑ | 0.8990 | ± | 0.0215 |
| - high_school_government_and_politics | 1 | none | 0 | acc | ↑ | 0.9637 | ± | 0.0135 |
| - high_school_macroeconomics | 1 | none | 0 | acc | ↑ | 0.8103 | ± | 0.0199 |
| - high_school_microeconomics | 1 | none | 0 | acc | ↑ | 0.9034 | ± | 0.0192 |
| - high_school_psychology | 1 | none | 0 | acc | ↑ | 0.9358 | ± | 0.0105 |
| - human_sexuality | 1 | none | 0 | acc | ↑ | 0.8855 | ± | 0.0279 |
| - professional_psychology | 1 | none | 0 | acc | ↑ | 0.8578 | ± | 0.0141 |
| - public_relations | 1 | none | 0 | acc | ↑ | 0.7909 | ± | 0.0390 |
| - security_studies | 1 | none | 0 | acc | ↑ | 0.8327 | ± | 0.0239 |
| - sociology | 1 | none | 0 | acc | ↑ | 0.9154 | ± | 0.0197 |
| - us_foreign_policy | 1 | none | 0 | acc | ↑ | 0.9300 | ± | 0.0256 |
| - stem | 2 | none | | acc | ↑ | 0.7545 | ± | 0.0073 |
| - abstract_algebra | 1 | none | 0 | acc | ↑ | 0.4600 | ± | 0.0501 |
| - anatomy | 1 | none | 0 | acc | ↑ | 0.8148 | ± | 0.0336 |
| - astronomy | 1 | none | 0 | acc | ↑ | 0.9211 | ± | 0.0219 |
| - college_biology | 1 | none | 0 | acc | ↑ | 0.9444 | ± | 0.0192 |
| - college_chemistry | 1 | none | 0 | acc | ↑ | 0.5700 | ± | 0.0498 |
| - college_computer_science | 1 | none | 0 | acc | ↑ | 0.7100 | ± | 0.0456 |
| - college_mathematics | 1 | none | 0 | acc | ↑ | 0.6200 | ± | 0.0488 |
| - college_physics | 1 | none | 0 | acc | ↑ | 0.6569 | ± | 0.0472 |
| - computer_security | 1 | none | 0 | acc | ↑ | 0.8300 | ± | 0.0378 |
| - conceptual_physics | 1 | none | 0 | acc | ↑ | 0.8170 | ± | 0.0253 |
| - electrical_engineering | 1 | none | 0 | acc | ↑ | 0.7931 | ± | 0.0338 |
| - elementary_mathematics | 1 | none | 0 | acc | ↑ | 0.7910 | ± | 0.0209 |
| - high_school_biology | 1 | none | 0 | acc | ↑ | 0.9323 | ± | 0.0143 |
| - high_school_chemistry | 1 | none | 0 | acc | ↑ | 0.7586 | ± | 0.0301 |
| - high_school_computer_science | 1 | none | 0 | acc | ↑ | 0.8900 | ± | 0.0314 |
| - high_school_mathematics | 1 | none | 0 | acc | ↑ | 0.5185 | ± | 0.0305 |
| - high_school_physics | 1 | none | 0 | acc | ↑ | 0.6291 | ± | 0.0394 |
| - high_school_statistics | 1 | none | 0 | acc | ↑ | 0.7593 | ± | 0.0292 |
| - machine_learning | 1 | none | 0 | acc | ↑ | 0.6250 | ± | 0.0460 |
| Groups | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| mmlu | 2 | none | | acc | ↑ | 0.7734 | ± | 0.0033 |
| - humanities | 2 | none | | acc | ↑ | 0.6820 | ± | 0.0062 |
| - other | 2 | none | | acc | ↑ | 0.8317 | ± | 0.0064 |
| - social sciences | 2 | none | | acc | ↑ | 0.8736 | ± | 0.0059 |
| - stem | 2 | none | | acc | ↑ | 0.7545 | ± | 0.0073 |
### Text Only: Full MMLU
```sh
lm_eval --model local-completions \
  --model_args "base_url=http://localhost:8000/v1/completions,model=casperhansen/Mistral-Small-3.1-24B-Base-2503-Text-Only" \
  --tasks mmlu \
  --batch_size 128
```
| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| mmlu | 2 | none | | acc | ↑ | 0.7725 | ± | 0.0033 |
| - humanities | 2 | none | | acc | ↑ | 0.6793 | ± | 0.0062 |
| - formal_logic | 1 | none | 0 | acc | ↑ | 0.5397 | ± | 0.0446 |
| - high_school_european_history | 1 | none | 0 | acc | ↑ | 0.8364 | ± | 0.0289 |
| - high_school_us_history | 1 | none | 0 | acc | ↑ | 0.9363 | ± | 0.0171 |
| - high_school_world_history | 1 | none | 0 | acc | ↑ | 0.9198 | ± | 0.0177 |
| - international_law | 1 | none | 0 | acc | ↑ | 0.9008 | ± | 0.0273 |
| - jurisprudence | 1 | none | 0 | acc | ↑ | 0.8148 | ± | 0.0376 |
| - logical_fallacies | 1 | none | 0 | acc | ↑ | 0.8405 | ± | 0.0288 |
| - moral_disputes | 1 | none | 0 | acc | ↑ | 0.8237 | ± | 0.0205 |
| - moral_scenarios | 1 | none | 0 | acc | ↑ | 0.3765 | ± | 0.0162 |
| - philosophy | 1 | none | 0 | acc | ↑ | 0.8264 | ± | 0.0215 |
| - prehistory | 1 | none | 0 | acc | ↑ | 0.8704 | ± | 0.0187 |
| - professional_law | 1 | none | 0 | acc | ↑ | 0.6108 | ± | 0.0125 |
| - world_religions | 1 | none | 0 | acc | ↑ | 0.8713 | ± | 0.0257 |
| - other | 2 | none | | acc | ↑ | 0.8339 | ± | 0.0064 |
| - business_ethics | 1 | none | 0 | acc | ↑ | 0.8300 | ± | 0.0378 |
| - clinical_knowledge | 1 | none | 0 | acc | ↑ | 0.8679 | ± | 0.0208 |
| - college_medicine | 1 | none | 0 | acc | ↑ | 0.7746 | ± | 0.0319 |
| - global_facts | 1 | none | 0 | acc | ↑ | 0.6800 | ± | 0.0469 |
| - human_aging | 1 | none | 0 | acc | ↑ | 0.8027 | ± | 0.0267 |
| - management | 1 | none | 0 | acc | ↑ | 0.9029 | ± | 0.0293 |
| - marketing | 1 | none | 0 | acc | ↑ | 0.9402 | ± | 0.0155 |
| - medical_genetics | 1 | none | 0 | acc | ↑ | 0.8900 | ± | 0.0314 |
| - miscellaneous | 1 | none | 0 | acc | ↑ | 0.9208 | ± | 0.0097 |
| - nutrition | 1 | none | 0 | acc | ↑ | 0.8791 | ± | 0.0187 |
| - professional_accounting | 1 | none | 0 | acc | ↑ | 0.6312 | ± | 0.0288 |
| - professional_medicine | 1 | none | 0 | acc | ↑ | 0.8603 | ± | 0.0211 |
| - virology | 1 | none | 0 | acc | ↑ | 0.5602 | ± | 0.0386 |
| - social sciences | 2 | none | | acc | ↑ | 0.8739 | ± | 0.0059 |
| - econometrics | 1 | none | 0 | acc | ↑ | 0.6667 | ± | 0.0443 |
| - high_school_geography | 1 | none | 0 | acc | ↑ | 0.8939 | ± | 0.0219 |
| - high_school_government_and_politics | 1 | none | 0 | acc | ↑ | 0.9585 | ± | 0.0144 |
| - high_school_macroeconomics | 1 | none | 0 | acc | ↑ | 0.8103 | ± | 0.0199 |
| - high_school_microeconomics | 1 | none | 0 | acc | ↑ | 0.9076 | ± | 0.0188 |
| - high_school_psychology | 1 | none | 0 | acc | ↑ | 0.9358 | ± | 0.0105 |
| - human_sexuality | 1 | none | 0 | acc | ↑ | 0.8855 | ± | 0.0279 |
| - professional_psychology | 1 | none | 0 | acc | ↑ | 0.8578 | ± | 0.0141 |
| - public_relations | 1 | none | 0 | acc | ↑ | 0.7909 | ± | 0.0390 |
| - security_studies | 1 | none | 0 | acc | ↑ | 0.8327 | ± | 0.0239 |
| - sociology | 1 | none | 0 | acc | ↑ | 0.9104 | ± | 0.0202 |
| - us_foreign_policy | 1 | none | 0 | acc | ↑ | 0.9400 | ± | 0.0239 |
| - stem | 2 | none | | acc | ↑ | 0.7520 | ± | 0.0073 |
| - abstract_algebra | 1 | none | 0 | acc | ↑ | 0.4500 | ± | 0.0500 |
| - anatomy | 1 | none | 0 | acc | ↑ | 0.8296 | ± | 0.0325 |
| - astronomy | 1 | none | 0 | acc | ↑ | 0.9211 | ± | 0.0219 |
| - college_biology | 1 | none | 0 | acc | ↑ | 0.9444 | ± | 0.0192 |
| - college_chemistry | 1 | none | 0 | acc | ↑ | 0.5600 | ± | 0.0499 |
| - college_computer_science | 1 | none | 0 | acc | ↑ | 0.7100 | ± | 0.0456 |
| - college_mathematics | 1 | none | 0 | acc | ↑ | 0.6200 | ± | 0.0488 |
| - college_physics | 1 | none | 0 | acc | ↑ | 0.6569 | ± | 0.0472 |
| - computer_security | 1 | none | 0 | acc | ↑ | 0.8300 | ± | 0.0378 |
| - conceptual_physics | 1 | none | 0 | acc | ↑ | 0.8213 | ± | 0.0250 |
| - electrical_engineering | 1 | none | 0 | acc | ↑ | 0.7862 | ± | 0.0342 |
| - elementary_mathematics | 1 | none | 0 | acc | ↑ | 0.7804 | ± | 0.0213 |
| - high_school_biology | 1 | none | 0 | acc | ↑ | 0.9290 | ± | 0.0146 |
| - high_school_chemistry | 1 | none | 0 | acc | ↑ | 0.7488 | ± | 0.0305 |
| - high_school_computer_science | 1 | none | 0 | acc | ↑ | 0.8900 | ± | 0.0314 |
| - high_school_mathematics | 1 | none | 0 | acc | ↑ | 0.5222 | ± | 0.0305 |
| - high_school_physics | 1 | none | 0 | acc | ↑ | 0.6225 | ± | 0.0396 |
| - high_school_statistics | 1 | none | 0 | acc | ↑ | 0.7500 | ± | 0.0295 |
| - machine_learning | 1 | none | 0 | acc | ↑ | 0.6339 | ± | 0.0457 |
| Groups | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| mmlu | 2 | none | | acc | ↑ | 0.7725 | ± | 0.0033 |
| - humanities | 2 | none | | acc | ↑ | 0.6793 | ± | 0.0062 |
| - other | 2 | none | | acc | ↑ | 0.8339 | ± | 0.0064 |
| - social sciences | 2 | none | | acc | ↑ | 0.8739 | ± | 0.0059 |
| - stem | 2 | none | | acc | ↑ | 0.7520 | ± | 0.0073 |