Critical-Future/Sarv-wip

Daemontatox committed on May 28

Commit cd8a4d0 (verified) · 1 parent: bd7b111

Update README.md

Files changed (1)

README.md +161 -157


@@ -1,196 +1,200 @@
 ---
-library_name: transformers
-license: apache-2.0
-language:
-- en
-- bn
-- hi
-- kn
-- gu
-- mr
-- ml
-- or
-- pa
-- ta
-- te
-base_model:
-- mistralai/Mistral-Small-3.1-24B-Base-2503
-base_model_relation: finetune
 ---
-# Sarvam-M
-<p align="center">
-  <a href="https://dashboard.sarvam.ai/playground"
-     target="_blank" rel="noopener noreferrer">
-    <img
-      src="https://img.shields.io/badge/🚀 Chat on Sarvam&nbsp;Playground-1488CC?style=for-the-badge&logo=rocket"
-      alt="Chat on Sarvam Playground"
-    />
-  </a>
-</p>
-# Model Information
-`sarvam-m` is a multilingual, hybrid-reasoning, text-only language model built on Mistral-Small. This post-trained version delivers exceptional improvements over the base model:
-- +20% average improvement on Indian language benchmarks
-- +21.6% enhancement on math benchmarks
-- +17.6% boost on programming benchmarks
-Performance gains are even more impressive at the intersection of Indian languages and mathematics, with an outstanding +86% improvement in romanized Indian language GSM-8K benchmarks.
-Learn more about sarvam-m in our detailed [blog post](https://www.sarvam.ai/blogs/sarvam-m).
-# Key Features
-- **Hybrid Thinking Mode**: A single versatile model supporting both "think" and "non-think" modes. Use the think mode for complex logical reasoning, mathematical problems, and coding tasks, or switch to non-think mode for efficient, general-purpose conversation.
-- **Advanced Indic Skills**: Specifically post-trained on Indian languages alongside English, embodying a character that authentically reflects and emphasizes Indian cultural values.
-- **Superior Reasoning Capabilities**: Outperforms most similarly-sized models on coding and math benchmarks, demonstrating exceptional reasoning abilities.
-- **Seamless Chatting Experience**: Full support for both Indic scripts and romanized versions of Indian languages, providing a smooth and accessible multilingual conversation experience.
-# Quickstart
-The following code snippet demonstrates how to use `sarvam-m` using Transformers.
-```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
-model_name = "sarvamai/sarvam-m"
-# load the tokenizer and the model
-tokenizer = AutoTokenizer.from_pretrained(model_name)
-model = AutoModelForCausalLM.from_pretrained(
-    model_name, torch_dtype="auto", device_map="auto"
-)
-# prepare the model input
-prompt = "Who are you and what is your purpose on this planet?"
-messages = [{"role": "user", "content": prompt}]
-text = tokenizer.apply_chat_template(
-    messages,
-    tokenize=False,
-    enable_thinking=True,  # Switches between thinking and non-thinking modes. Default is True.
-)
-model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
-# conduct text completion
-generated_ids = model.generate(**model_inputs, max_new_tokens=8192)
-output_ids = generated_ids[0][len(model_inputs.input_ids[0]) :].tolist()
-output_text = tokenizer.decode(output_ids)
-if "</think>" in output_text:
-    reasoning_content = output_text.split("</think>")[0].rstrip("\n")
-    content = output_text.split("</think>")[-1].lstrip("\n").rstrip("</s>")
-else:
-    reasoning_content = ""
-    content = output_text.rstrip("</s>")
-print("reasoning content:", reasoning_content)
-print("content:", content)
-```
-> [!NOTE]
-> For thinking mode, we recommend `temperature=0.5`; for no-think mode, `temperature=0.2`.
-# With Sarvam APIs
-```python
-from openai import OpenAI
-base_url = "https://api.sarvam.ai/v1"
-model_name = "sarvam-m"
-api_key = "Your-API-Key"  # get it from https://dashboard.sarvam.ai/
-client = OpenAI(
-    base_url=base_url,
-    api_key=api_key,
-).with_options(max_retries=1)
-messages = [
-    {"role": "system", "content": "You're a helpful AI assistant"},
-    {"role": "user", "content": "Explain quantum computing in simple terms"},
-]
-response1 = client.chat.completions.create(
-    model=model_name,
-    messages=messages,
-    reasoning_effort="medium",  # Enable thinking mode. `None` for disable.
-    max_completion_tokens=4096,
-)
-print("First response:", response1.choices[0].message.content)
-# Building messages for the second turn (using previous response as context)
-messages.extend(
-    [
-        {
-            "role": "assistant",
-            "content": response1.choices[0].message.content,
-        },
-        {"role": "user", "content": "Can you give an analogy for superposition?"},
-    ]
-)
-response2 = client.chat.completions.create(
-    model=model_name,
-    messages=messages,
-    reasoning_effort="medium",
-    max_completion_tokens=8192,
-)
-print("Follow-up response:", response2.choices[0].message.content)
-```
-Refer to API docs here: [sarvam Chat Completions API docs](https://docs.sarvam.ai/api-reference-docs/chat/completions)
-`reasoning_effort` can take three possible values: `low`, `medium`, and `high` to be consistent with the OpenAI API spec. Setting any of the three values just enables the thinking mode of sarvam-m.
-# VLLM Deployment
-For easy deployment, we can use `vllm>=0.8.5` and create an OpenAI-compatible API endpoint with `vllm serve sarvamai/sarvam-m`.
-If you want to use vLLM with python, you can do the following.
-```python
-from openai import OpenAI
-# Modify OpenAI's API key and API base to use vLLM's API server.
-openai_api_key = "EMPTY"
-openai_api_base = "http://localhost:8000/v1"
-client = OpenAI(
-    api_key=openai_api_key,
-    base_url=openai_api_base,
-)
-models = client.models.list()
-model = models.data[0].id
-messages = [{"role": "user", "content": "Why is 42 the best number?"}]
-# By default, thinking mode is enabled.
-# If you want to disable thinking, add:
-# extra_body={"chat_template_kwargs": {"enable_thinking": False}}
-response = client.chat.completions.create(model=model, messages=messages)
-output_text = response.choices[0].message.content
-if "</think>" in output_text:
-    reasoning_content = output_text.split("</think>")[0].rstrip("\n")
-    content = output_text.split("</think>")[-1].lstrip("\n")
-else:
-    reasoning_content = ""
-    content = output_text
-print("reasoning content:", reasoning_content)
-print("content:", content)
-# For the next round, add the model's response directly as assistant turn.
-messages.append(
-    {"role": "assistant", "content": output_text}
-)
-```

 ---
+# For reference on model card metadata, see the spec: https://github.com/huggingface/hub-docs/blob/main/modelcard.md?plain=1
+# Doc / guide: https://huggingface.co/docs/hub/model-cards
+{}
 ---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+This modelcard aims to be a base template for new models. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]
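
A note on the Quickstart snippet removed by this commit: it separated the reasoning from the answer by splitting on `</think>`, then called `.rstrip("</s>")` to drop the end-of-sequence token. `str.rstrip` treats its argument as a set of characters rather than a suffix, so it can also strip trailing `s`, `<`, `/`, or `>` characters that belong to the answer itself. Below is a minimal sketch of a stricter split, assuming Python 3.9+ (for `str.removesuffix`). The helper name `split_reasoning` is hypothetical and does not appear in the original README, and unlike the original it splits on the first `</think>` only.

```python
def split_reasoning(output_text: str, eos_token: str = "</s>") -> tuple[str, str]:
    """Split decoded model output into (reasoning_content, content).

    Hypothetical helper, not part of the original README.
    """
    # Remove the end-of-sequence token only when it is an exact suffix.
    text = output_text.removesuffix(eos_token)

    # Split on the first "</think>" marker; `sep` is empty if it is absent.
    reasoning, sep, content = text.partition("</think>")
    if not sep:
        # No thinking block: the whole text is the answer.
        return "", text
    return reasoning.rstrip("\n"), content.lstrip("\n")


print(split_reasoning("plan\n</think>\nanswers</s>"))  # ('plan', 'answers')
```

With the removed snippet's approach, an answer ending in `class</s>` would be cut down to `cla`, whereas the suffix-based version returns `class`.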