Model Card for Mistral-Small-3.1-24B-Base-2503 (TEXT ONLY)

This is the text-only variant of mistralai/Mistral-Small-3.1-24B-Base-2503. It also serves as the base model for mistralai/Devstral-Small-2505, for which no official base model was released.

Features:

  • Text-only, no multimodality.
  • 128k context length.

How was a text-only model achieved? The vision encoder was removed and the model architecture was converted from mistral3 to mistral. The tokenizer was not modified.
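
The conversion described above can be illustrated with a minimal sketch of a weight-name remapping. This is not the script used to produce this model: the prefixes `vision_tower.`, `multi_modal_projector.` and `language_model.` are assumptions about how a mistral3 checkpoint names its weights, and the real key names may differ between library versions. The sketch only shows the idea of dropping vision weights and keeping the language-model weights under mistral-style names.

```python
# Hypothetical key prefixes; check the actual checkpoint before relying on them.
VISION_PREFIXES = ("vision_tower.", "multi_modal_projector.")
LANGUAGE_PREFIX = "language_model."


def to_text_only(state_dict):
    """Drop vision weights and strip the language-model prefix from the rest."""
    text_only = {}
    for name, tensor in state_dict.items():
        if name.startswith(VISION_PREFIXES):
            continue  # vision encoder and projector weights are removed
        if name.startswith(LANGUAGE_PREFIX):
            name = name[len(LANGUAGE_PREFIX):]
        text_only[name] = tensor
    return text_only


# Toy example with placeholder values instead of real tensors.
example = {
    "vision_tower.patch_conv.weight": "v0",
    "multi_modal_projector.linear_1.weight": "p0",
    "language_model.model.embed_tokens.weight": "e0",
    "language_model.lm_head.weight": "h0",
}
print(sorted(to_text_only(example)))
```

In a real conversion the model config would also need to declare the mistral architecture instead of mistral3; that step is not shown here.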

Reproduced eval

Serve with vLLM:

vllm serve casperhansen/Mistral-Small-3.1-24B-Base-2503-Text-Only
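
Once the server is running, a quick sanity check is to send one completion request. The sketch below uses only the Python standard library and assumes the server listens on `http://localhost:8000`, the same address used by the eval commands in this card. The helper names `build_request` and `complete` are illustrative, not part of vLLM.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1/completions"
MODEL = "casperhansen/Mistral-Small-3.1-24B-Base-2503-Text-Only"


def build_request(prompt, max_tokens=32):
    """Build a POST request for the OpenAI-compatible completions endpoint."""
    payload = {
        "model": MODEL,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.0,  # greedy decoding for a repeatable check
    }
    return urllib.request.Request(
        BASE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, max_tokens=32):
    """Send the request and return the generated text of the first choice."""
    with urllib.request.urlopen(build_request(prompt, max_tokens)) as response:
        body = json.load(response)
    return body["choices"][0]["text"]

# Usage, with the server running:
#   print(complete("The capital of France is"))
```

Because this is a base model, plain-text prompts for the completions endpoint are the natural fit.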

The reproduced results can be seen below.

| Model | MMLU (0-shot) |
|---|---|
| Small 3.1 24B Base (Text Only) | 77.25% ± 0.33% |
| Small 3.1 24B Base (Multimodal) | 77.34% ± 0.33% |
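
To put the gap between the two scores in context, the following sketch compares the difference with the reported standard errors. It treats the two runs as independent, which is a simplifying assumption: both runs use the same questions, so a paired analysis would be more precise. The accuracies and standard errors are the fractional values reported by lm_eval.

```python
import math

# Overall MMLU accuracy and standard error, as fractions.
text_only_acc, text_only_se = 0.7725, 0.0033
multimodal_acc, multimodal_se = 0.7734, 0.0033

diff = multimodal_acc - text_only_acc
# Standard error of a difference of two independent estimates (assumption).
combined_se = math.sqrt(text_only_se**2 + multimodal_se**2)
z = diff / combined_se

print(f"difference = {diff:.4f}, combined stderr = {combined_se:.4f}, z = {z:.2f}")
```

Under this assumption the difference is about 0.19 combined standard errors, well inside the noise of the measurement.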

Original Multimodal: Full MMLU (Reproduced)

lm_eval --model local-completions \
  --model_args "base_url=http://localhost:8000/v1/completions,model=mistralai/Mistral-Small-3.1-24B-Base-2503" \
  --tasks mmlu \
  --batch_size 128

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| mmlu | 2 | none | | acc ↑ | 0.7734 | ± 0.0033 |
| - humanities | 2 | none | | acc ↑ | 0.6820 | ± 0.0062 |
| - formal_logic | 1 | none | 0 | acc ↑ | 0.5714 | ± 0.0443 |
| - high_school_european_history | 1 | none | 0 | acc ↑ | 0.8303 | ± 0.0293 |
| - high_school_us_history | 1 | none | 0 | acc ↑ | 0.9363 | ± 0.0171 |
| - high_school_world_history | 1 | none | 0 | acc ↑ | 0.9241 | ± 0.0172 |
| - international_law | 1 | none | 0 | acc ↑ | 0.9091 | ± 0.0262 |
| - jurisprudence | 1 | none | 0 | acc ↑ | 0.8148 | ± 0.0376 |
| - logical_fallacies | 1 | none | 0 | acc ↑ | 0.8589 | ± 0.0274 |
| - moral_disputes | 1 | none | 0 | acc ↑ | 0.8208 | ± 0.0206 |
| - moral_scenarios | 1 | none | 0 | acc ↑ | 0.3844 | ± 0.0163 |
| - philosophy | 1 | none | 0 | acc ↑ | 0.8296 | ± 0.0214 |
| - prehistory | 1 | none | 0 | acc ↑ | 0.8704 | ± 0.0187 |
| - professional_law | 1 | none | 0 | acc ↑ | 0.6095 | ± 0.0125 |
| - world_religions | 1 | none | 0 | acc ↑ | 0.8713 | ± 0.0257 |
| - other | 2 | none | | acc ↑ | 0.8317 | ± 0.0064 |
| - business_ethics | 1 | none | 0 | acc ↑ | 0.8200 | ± 0.0386 |
| - clinical_knowledge | 1 | none | 0 | acc ↑ | 0.8679 | ± 0.0208 |
| - college_medicine | 1 | none | 0 | acc ↑ | 0.7803 | ± 0.0316 |
| - global_facts | 1 | none | 0 | acc ↑ | 0.6600 | ± 0.0476 |
| - human_aging | 1 | none | 0 | acc ↑ | 0.7982 | ± 0.0269 |
| - management | 1 | none | 0 | acc ↑ | 0.9029 | ± 0.0293 |
| - marketing | 1 | none | 0 | acc ↑ | 0.9359 | ± 0.0160 |
| - medical_genetics | 1 | none | 0 | acc ↑ | 0.8900 | ± 0.0314 |
| - miscellaneous | 1 | none | 0 | acc ↑ | 0.9183 | ± 0.0098 |
| - nutrition | 1 | none | 0 | acc ↑ | 0.8791 | ± 0.0187 |
| - professional_accounting | 1 | none | 0 | acc ↑ | 0.6277 | ± 0.0288 |
| - professional_medicine | 1 | none | 0 | acc ↑ | 0.8603 | ± 0.0211 |
| - virology | 1 | none | 0 | acc ↑ | 0.5602 | ± 0.0386 |
| - social sciences | 2 | none | | acc ↑ | 0.8736 | ± 0.0059 |
| - econometrics | 1 | none | 0 | acc ↑ | 0.6491 | ± 0.0449 |
| - high_school_geography | 1 | none | 0 | acc ↑ | 0.8990 | ± 0.0215 |
| - high_school_government_and_politics | 1 | none | 0 | acc ↑ | 0.9637 | ± 0.0135 |
| - high_school_macroeconomics | 1 | none | 0 | acc ↑ | 0.8103 | ± 0.0199 |
| - high_school_microeconomics | 1 | none | 0 | acc ↑ | 0.9034 | ± 0.0192 |
| - high_school_psychology | 1 | none | 0 | acc ↑ | 0.9358 | ± 0.0105 |
| - human_sexuality | 1 | none | 0 | acc ↑ | 0.8855 | ± 0.0279 |
| - professional_psychology | 1 | none | 0 | acc ↑ | 0.8578 | ± 0.0141 |
| - public_relations | 1 | none | 0 | acc ↑ | 0.7909 | ± 0.0390 |
| - security_studies | 1 | none | 0 | acc ↑ | 0.8327 | ± 0.0239 |
| - sociology | 1 | none | 0 | acc ↑ | 0.9154 | ± 0.0197 |
| - us_foreign_policy | 1 | none | 0 | acc ↑ | 0.9300 | ± 0.0256 |
| - stem | 2 | none | | acc ↑ | 0.7545 | ± 0.0073 |
| - abstract_algebra | 1 | none | 0 | acc ↑ | 0.4600 | ± 0.0501 |
| - anatomy | 1 | none | 0 | acc ↑ | 0.8148 | ± 0.0336 |
| - astronomy | 1 | none | 0 | acc ↑ | 0.9211 | ± 0.0219 |
| - college_biology | 1 | none | 0 | acc ↑ | 0.9444 | ± 0.0192 |
| - college_chemistry | 1 | none | 0 | acc ↑ | 0.5700 | ± 0.0498 |
| - college_computer_science | 1 | none | 0 | acc ↑ | 0.7100 | ± 0.0456 |
| - college_mathematics | 1 | none | 0 | acc ↑ | 0.6200 | ± 0.0488 |
| - college_physics | 1 | none | 0 | acc ↑ | 0.6569 | ± 0.0472 |
| - computer_security | 1 | none | 0 | acc ↑ | 0.8300 | ± 0.0378 |
| - conceptual_physics | 1 | none | 0 | acc ↑ | 0.8170 | ± 0.0253 |
| - electrical_engineering | 1 | none | 0 | acc ↑ | 0.7931 | ± 0.0338 |
| - elementary_mathematics | 1 | none | 0 | acc ↑ | 0.7910 | ± 0.0209 |
| - high_school_biology | 1 | none | 0 | acc ↑ | 0.9323 | ± 0.0143 |
| - high_school_chemistry | 1 | none | 0 | acc ↑ | 0.7586 | ± 0.0301 |
| - high_school_computer_science | 1 | none | 0 | acc ↑ | 0.8900 | ± 0.0314 |
| - high_school_mathematics | 1 | none | 0 | acc ↑ | 0.5185 | ± 0.0305 |
| - high_school_physics | 1 | none | 0 | acc ↑ | 0.6291 | ± 0.0394 |
| - high_school_statistics | 1 | none | 0 | acc ↑ | 0.7593 | ± 0.0292 |
| - machine_learning | 1 | none | 0 | acc ↑ | 0.6250 | ± 0.0460 |

| Groups | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| mmlu | 2 | none | | acc ↑ | 0.7734 | ± 0.0033 |
| - humanities | 2 | none | | acc ↑ | 0.6820 | ± 0.0062 |
| - other | 2 | none | | acc ↑ | 0.8317 | ± 0.0064 |
| - social sciences | 2 | none | | acc ↑ | 0.8736 | ± 0.0059 |
| - stem | 2 | none | | acc ↑ | 0.7545 | ± 0.0073 |
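
The per-task Stderr values above can be sanity-checked by hand. The sketch below assumes each answer is scored 0 or 1 and that the standard error is the sample standard deviation (with an n - 1 denominator) divided by the square root of n. It also assumes the abstract_algebra subset has 100 questions; neither assumption is stated in this card.

```python
import math


def sample_stderr(accuracy, n):
    """Standard error of the mean of n 0/1 scores, using the n - 1 sample variance."""
    return math.sqrt(accuracy * (1.0 - accuracy) / (n - 1))


# abstract_algebra: reported accuracy 0.4600, assumed 100 questions.
print(round(sample_stderr(0.4600, 100), 4))
```

Under these assumptions the result rounds to 0.0501, matching the value reported for abstract_algebra.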

Text Only: Full MMLU

lm_eval --model local-completions \
  --model_args "base_url=http://localhost:8000/v1/completions,model=casperhansen/Mistral-Small-3.1-24B-Base-2503-Text-Only" \
  --tasks mmlu \
  --batch_size 128

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| mmlu | 2 | none | | acc ↑ | 0.7725 | ± 0.0033 |
| - humanities | 2 | none | | acc ↑ | 0.6793 | ± 0.0062 |
| - formal_logic | 1 | none | 0 | acc ↑ | 0.5397 | ± 0.0446 |
| - high_school_european_history | 1 | none | 0 | acc ↑ | 0.8364 | ± 0.0289 |
| - high_school_us_history | 1 | none | 0 | acc ↑ | 0.9363 | ± 0.0171 |
| - high_school_world_history | 1 | none | 0 | acc ↑ | 0.9198 | ± 0.0177 |
| - international_law | 1 | none | 0 | acc ↑ | 0.9008 | ± 0.0273 |
| - jurisprudence | 1 | none | 0 | acc ↑ | 0.8148 | ± 0.0376 |
| - logical_fallacies | 1 | none | 0 | acc ↑ | 0.8405 | ± 0.0288 |
| - moral_disputes | 1 | none | 0 | acc ↑ | 0.8237 | ± 0.0205 |
| - moral_scenarios | 1 | none | 0 | acc ↑ | 0.3765 | ± 0.0162 |
| - philosophy | 1 | none | 0 | acc ↑ | 0.8264 | ± 0.0215 |
| - prehistory | 1 | none | 0 | acc ↑ | 0.8704 | ± 0.0187 |
| - professional_law | 1 | none | 0 | acc ↑ | 0.6108 | ± 0.0125 |
| - world_religions | 1 | none | 0 | acc ↑ | 0.8713 | ± 0.0257 |
| - other | 2 | none | | acc ↑ | 0.8339 | ± 0.0064 |
| - business_ethics | 1 | none | 0 | acc ↑ | 0.8300 | ± 0.0378 |
| - clinical_knowledge | 1 | none | 0 | acc ↑ | 0.8679 | ± 0.0208 |
| - college_medicine | 1 | none | 0 | acc ↑ | 0.7746 | ± 0.0319 |
| - global_facts | 1 | none | 0 | acc ↑ | 0.6800 | ± 0.0469 |
| - human_aging | 1 | none | 0 | acc ↑ | 0.8027 | ± 0.0267 |
| - management | 1 | none | 0 | acc ↑ | 0.9029 | ± 0.0293 |
| - marketing | 1 | none | 0 | acc ↑ | 0.9402 | ± 0.0155 |
| - medical_genetics | 1 | none | 0 | acc ↑ | 0.8900 | ± 0.0314 |
| - miscellaneous | 1 | none | 0 | acc ↑ | 0.9208 | ± 0.0097 |
| - nutrition | 1 | none | 0 | acc ↑ | 0.8791 | ± 0.0187 |
| - professional_accounting | 1 | none | 0 | acc ↑ | 0.6312 | ± 0.0288 |
| - professional_medicine | 1 | none | 0 | acc ↑ | 0.8603 | ± 0.0211 |
| - virology | 1 | none | 0 | acc ↑ | 0.5602 | ± 0.0386 |
| - social sciences | 2 | none | | acc ↑ | 0.8739 | ± 0.0059 |
| - econometrics | 1 | none | 0 | acc ↑ | 0.6667 | ± 0.0443 |
| - high_school_geography | 1 | none | 0 | acc ↑ | 0.8939 | ± 0.0219 |
| - high_school_government_and_politics | 1 | none | 0 | acc ↑ | 0.9585 | ± 0.0144 |
| - high_school_macroeconomics | 1 | none | 0 | acc ↑ | 0.8103 | ± 0.0199 |
| - high_school_microeconomics | 1 | none | 0 | acc ↑ | 0.9076 | ± 0.0188 |
| - high_school_psychology | 1 | none | 0 | acc ↑ | 0.9358 | ± 0.0105 |
| - human_sexuality | 1 | none | 0 | acc ↑ | 0.8855 | ± 0.0279 |
| - professional_psychology | 1 | none | 0 | acc ↑ | 0.8578 | ± 0.0141 |
| - public_relations | 1 | none | 0 | acc ↑ | 0.7909 | ± 0.0390 |
| - security_studies | 1 | none | 0 | acc ↑ | 0.8327 | ± 0.0239 |
| - sociology | 1 | none | 0 | acc ↑ | 0.9104 | ± 0.0202 |
| - us_foreign_policy | 1 | none | 0 | acc ↑ | 0.9400 | ± 0.0239 |
| - stem | 2 | none | | acc ↑ | 0.7520 | ± 0.0073 |
| - abstract_algebra | 1 | none | 0 | acc ↑ | 0.4500 | ± 0.0500 |
| - anatomy | 1 | none | 0 | acc ↑ | 0.8296 | ± 0.0325 |
| - astronomy | 1 | none | 0 | acc ↑ | 0.9211 | ± 0.0219 |
| - college_biology | 1 | none | 0 | acc ↑ | 0.9444 | ± 0.0192 |
| - college_chemistry | 1 | none | 0 | acc ↑ | 0.5600 | ± 0.0499 |
| - college_computer_science | 1 | none | 0 | acc ↑ | 0.7100 | ± 0.0456 |
| - college_mathematics | 1 | none | 0 | acc ↑ | 0.6200 | ± 0.0488 |
| - college_physics | 1 | none | 0 | acc ↑ | 0.6569 | ± 0.0472 |
| - computer_security | 1 | none | 0 | acc ↑ | 0.8300 | ± 0.0378 |
| - conceptual_physics | 1 | none | 0 | acc ↑ | 0.8213 | ± 0.0250 |
| - electrical_engineering | 1 | none | 0 | acc ↑ | 0.7862 | ± 0.0342 |
| - elementary_mathematics | 1 | none | 0 | acc ↑ | 0.7804 | ± 0.0213 |
| - high_school_biology | 1 | none | 0 | acc ↑ | 0.9290 | ± 0.0146 |
| - high_school_chemistry | 1 | none | 0 | acc ↑ | 0.7488 | ± 0.0305 |
| - high_school_computer_science | 1 | none | 0 | acc ↑ | 0.8900 | ± 0.0314 |
| - high_school_mathematics | 1 | none | 0 | acc ↑ | 0.5222 | ± 0.0305 |
| - high_school_physics | 1 | none | 0 | acc ↑ | 0.6225 | ± 0.0396 |
| - high_school_statistics | 1 | none | 0 | acc ↑ | 0.7500 | ± 0.0295 |
| - machine_learning | 1 | none | 0 | acc ↑ | 0.6339 | ± 0.0457 |

| Groups | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| mmlu | 2 | none | | acc ↑ | 0.7725 | ± 0.0033 |
| - humanities | 2 | none | | acc ↑ | 0.6793 | ± 0.0062 |
| - other | 2 | none | | acc ↑ | 0.8339 | ± 0.0064 |
| - social sciences | 2 | none | | acc ↑ | 0.8739 | ± 0.0059 |
| - stem | 2 | none | | acc ↑ | 0.7520 | ± 0.0073 |
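
For a closer look than the overall score, per-subject deltas between the two runs can be compared against the reported standard error. The sketch below uses a handful of subjects with comparatively large gaps, with values copied from the two result listings in this card. The one-standard-error threshold is an illustrative choice, not a formal significance test.

```python
# (subject, multimodal acc, text-only acc, text-only stderr)
ROWS = [
    ("formal_logic", 0.5714, 0.5397, 0.0446),
    ("econometrics", 0.6491, 0.6667, 0.0443),
    ("global_facts", 0.6600, 0.6800, 0.0469),
    ("logical_fallacies", 0.8589, 0.8405, 0.0288),
    ("anatomy", 0.8148, 0.8296, 0.0325),
]


def compare(rows):
    """Return (subject, delta, within_one_stderr) for each row."""
    results = []
    for subject, multimodal, text_only, stderr in rows:
        delta = text_only - multimodal
        results.append((subject, delta, abs(delta) <= stderr))
    return results


for subject, delta, within in compare(ROWS):
    print(f"{subject:20s} delta = {delta:+.4f}  within one stderr: {within}")
```

For these subjects, every delta stays within one standard error of the text-only score.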

Model size: 23.6B params (Safetensors, tensor type BF16)