---
library_name: onnxruntime_genai
license: apache-2.0
language:
  - en
  - bn
  - hi
  - kn
  - gu
  - mr
  - ml
  - or
  - pa
  - ta
  - te
tags:
  - mistral3
  - indic
  - onnx
  - onnxruntime-genai
  - sarvam
  - text-generation-inference
  - cuda
base_model_relation: quantized
base_model:
  - sarvamai/sarvam-m
---

# Sarvam-M

Chat on Sarvam Playground

## Model Information

sarvam-m is a multilingual, hybrid-reasoning, text-only language model built on Mistral-Small. This post-trained version delivers substantial improvements over the base model:

  • +20% average improvement on Indian language benchmarks
  • +21.6% enhancement on math benchmarks
  • +17.6% boost on programming benchmarks

Performance gains are even larger at the intersection of Indian languages and mathematics, with an +86% improvement on romanized Indian language GSM-8K benchmarks.

Learn more about sarvam-m in our detailed blog post.

## Key Features

  • Hybrid Thinking Mode: A single versatile model supporting both "think" and "non-think" modes. Use the think mode for complex logical reasoning, mathematical problems, and coding tasks, or switch to non-think mode for efficient, general-purpose conversation.

  • Advanced Indic Skills: Specifically post-trained on Indian languages alongside English, embodying a character that authentically reflects and emphasizes Indian cultural values.

  • Superior Reasoning Capabilities: Outperforms most similarly-sized models on coding and math benchmarks, demonstrating exceptional reasoning abilities.

  • Seamless Chatting Experience: Full support for both Indic scripts and romanized versions of Indian languages, providing a smooth and accessible multilingual conversation experience.

## Conversion

The original model was converted to ONNX using ONNX Runtime GenAI, developed by Microsoft.
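As a sketch, a conversion like this is typically done with the model builder that ships with the onnxruntime-genai package; the output path, precision, and execution provider below are illustrative choices, not the exact command used for this repository:

```shell
# Export the Hugging Face model to ONNX with the onnxruntime-genai model builder.
# -p selects the precision (e.g. int4 quantization, matching base_model_relation: quantized)
# -e selects the execution provider (cuda here, per the model card tags)
python -m onnxruntime_genai.models.builder \
    -m sarvamai/sarvam-m \
    -o ./sarvam-m-onnx \
    -p int4 \
    -e cuda
```

The builder downloads the source weights, quantizes them, and writes the ONNX graph plus a `genai_config.json` and tokenizer files into the output directory.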

## Quickstart

The following code snippet demonstrates how to run sarvam-m with ONNX Runtime GenAI.


For thinking mode, we recommend `temperature=0.5`; for non-think mode, `temperature=0.2`.
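A minimal sketch of loading the converted model and streaming a reply with the onnxruntime-genai Python API; the model directory `./sarvam-m-onnx` and the Mistral-style prompt format are assumptions, so adjust both to the actual export:

```python
import onnxruntime_genai as og

# Load the exported ONNX model directory (assumed path).
model = og.Model("./sarvam-m-onnx")
tokenizer = og.Tokenizer(model)

# Mistral-style instruction format (assumed; check the tokenizer's chat template).
prompt = "<s>[INST] Explain the rules of kabaddi in two sentences. [/INST]"

params = og.GeneratorParams(model)
# temperature=0.5 for thinking mode; use 0.2 for non-think mode.
params.set_search_options(max_length=512, temperature=0.5)

generator = og.Generator(model, params)
generator.append_tokens(tokenizer.encode(prompt))

# Stream tokens as they are generated.
stream = tokenizer.create_stream()
while not generator.is_done():
    generator.generate_next_token()
    print(stream.decode(generator.get_next_tokens()[0]), end="", flush=True)
print()
```

Note that `append_tokens` is the generation entry point in recent onnxruntime-genai releases; older versions instead set `params.input_ids` before constructing the `Generator`.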