---
library_name: onnxruntime_genai
license: apache-2.0
language:
- en
- bn
- hi
- kn
- gu
- mr
- ml
- or
- pa
- ta
- te
tags:
- mistral3
- indic
- onnx
- onnxruntime-genai
- sarvam
- text-generation-inference
- cuda
base_model_relation: quantized
base_model:
- sarvamai/sarvam-m
---
# Sarvam-M

## Model Information

`sarvam-m` is a multilingual, hybrid-reasoning, text-only language model built on Mistral-Small. This post-trained version delivers exceptional improvements over the base model:
- +20% average improvement on Indian language benchmarks
- +21.6% enhancement on math benchmarks
- +17.6% boost on programming benchmarks
Performance gains are even more impressive at the intersection of Indian languages and mathematics, with an outstanding +86% improvement in romanized Indian language GSM-8K benchmarks.
Learn more about `sarvam-m` in our detailed blog post.
## Key Features

- **Hybrid Thinking Mode:** A single versatile model supporting both "think" and "non-think" modes. Use think mode for complex logical reasoning, mathematical problems, and coding tasks, or switch to non-think mode for efficient, general-purpose conversation.
- **Advanced Indic Skills:** Specifically post-trained on Indian languages alongside English, embodying a character that authentically reflects and emphasizes Indian cultural values.
- **Superior Reasoning Capabilities:** Outperforms most similarly sized models on coding and math benchmarks, demonstrating exceptional reasoning abilities.
- **Seamless Chatting Experience:** Full support for both Indic scripts and romanized versions of Indian languages, providing a smooth and accessible multilingual conversation experience.
## Conversion

The original model was converted to ONNX using ONNX Runtime GenAI, developed by Microsoft.
## Quickstart

The following code snippet demonstrates how to run `sarvam-m` with ONNX Runtime GenAI. For thinking mode, we recommend `temperature=0.5`; for non-think mode, `temperature=0.2`.
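A minimal sketch of text generation with the `onnxruntime-genai` Python API (`pip install onnxruntime-genai-cuda` for CUDA builds). The model directory path, the `format_prompt` helper, and the `max_length` value are illustrative assumptions, not part of this model card; in practice, build the prompt with the model's own chat template and think/no-think toggle. The generation loop follows recent `onnxruntime-genai` releases (older versions use a different input-feeding API).

```python
def format_prompt(user_message: str, thinking: bool = True) -> str:
    """Placeholder chat layout with a think/non-think marker.

    Assumption for illustration only: real usage should apply the model's
    own chat template and its thinking-mode toggle.
    """
    mode = "think" if thinking else "no_think"
    return f"[mode: {mode}]\nUser: {user_message}\nAssistant:"


def generate(model_dir: str, prompt: str,
             temperature: float = 0.5, max_length: int = 1024) -> str:
    # Imported inside the function so the pure-Python helper above
    # stays usable without the package installed.
    import onnxruntime_genai as og

    model = og.Model(model_dir)        # loads the ONNX weights + genai_config.json
    tokenizer = og.Tokenizer(model)
    stream = tokenizer.create_stream() # incremental detokenizer

    params = og.GeneratorParams(model)
    # temperature=0.5 for think mode, 0.2 for non-think mode (see above).
    params.set_search_options(do_sample=True,
                              temperature=temperature,
                              max_length=max_length)

    generator = og.Generator(model, params)
    generator.append_tokens(tokenizer.encode(prompt))

    pieces = []
    while not generator.is_done():
        generator.generate_next_token()
        pieces.append(stream.decode(generator.get_next_tokens()[0]))
    return "".join(pieces)


if __name__ == "__main__":
    # "path/to/sarvam-m-onnx" is a placeholder for the downloaded model folder.
    print(generate("path/to/sarvam-m-onnx",
                   format_prompt("What is 2 + 2?", thinking=True),
                   temperature=0.5))
```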