bullerwins
/

Devstral-Small-2505-fp8

Text Generation

text2text-generation

compressed-tensors

Model card Files Files and versions

bullerwins commited on May 21

Commit

a17aa40

·

verified ·

1 Parent(s): 206b946

Update README.md

Files changed (1) hide show

README.md +7 -0

README.md CHANGED Viewed

@@ -35,6 +35,13 @@ extra_gated_description: >-
 pipeline_tag: text2text-generation
 ---
 # Devstral-Small-2505
 Devstral is an agentic LLM for software engineering tasks built under a collaboration between [Mistral AI](https://mistral.ai/) and [All Hands AI](https://www.all-hands.dev/) 🙌. Devstral excels at using tools to explore codebases, editing multiple files and power software engineering agents. The model achieves remarkable performance on SWE-bench which positionates it as the #1 open source model on this [benchmark](#benchmark-results).

 pipeline_tag: text2text-generation
 ---
+Quantized to FP8 with [LLMCompressor](https://github.com/vllm-project/llm-compressor)
+Ideal to run on a dual GPU system like 2x3090 with vLLM or SGlang:
+`vllm serve bullerwins/Devstral-Small-2505-fp8 --max-model-len 16000 --host 0.0.0.0 --port 5000 -tp 2 --tokenizer_mode mistral`
 # Devstral-Small-2505
 Devstral is an agentic LLM for software engineering tasks built under a collaboration between [Mistral AI](https://mistral.ai/) and [All Hands AI](https://www.all-hands.dev/) 🙌. Devstral excels at using tools to explore codebases, editing multiple files and power software engineering agents. The model achieves remarkable performance on SWE-bench which positionates it as the #1 open source model on this [benchmark](#benchmark-results).