Intel/DeepSeek-V3.1-int4-AutoRound

@@ -10,7 +10,87 @@ This model is a int4 model with group_size 128 and symmetric quantization of [de
 Please follow the license of the original model.
 ## How To Use
+### INT4 Inference
+```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+quantized_model_dir = "Intel/DeepSeek-V3.1-int4-AutoRound"
+model = AutoModelForCausalLM.from_pretrained(
+        quantized_model_dir,
+        torch_dtype=torch.bfloat16,
+        device_map="auto",
+)
+tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir, trust_remote_code=True)
+prompts = [
+        "strawberry中有几个r?",  # Chinese for: how many r's are in "strawberry"?
+        "There is a girl who likes adventure,",
+        "Please give a brief introduction of DeepSeek company.",
+        ]
+texts = []
+for prompt in prompts:
+    messages = [
+            {"role": "system", "content": "You are a helpful assistant."},
+            {"role": "user", "content": prompt}
+    ]
+    text = tokenizer.apply_chat_template(
+            messages,
+            tokenize=False,
+            add_generation_prompt=True
+            )
+    texts.append(text)
+inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
+outputs = model.generate(
+        input_ids=inputs["input_ids"].to(model.device),
+        attention_mask=inputs["attention_mask"].to(model.device),
+        max_length=200,  # counts prompt + generated tokens; adjust to match the official usage
+        num_return_sequences=1,
+        do_sample=False,  # greedy decoding; adjust to match the official usage
+        )
+generated_ids = [
+        output_ids[len(input_ids):] for input_ids, output_ids in zip(inputs["input_ids"], outputs)
+        ]
+decoded_outputs = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
+for prompt, generated in zip(prompts, decoded_outputs):
+    print(f"Prompt: {prompt}")
+    print(f"Generated: {generated}")
+    print("-" * 50)
+"""
+Prompt: strawberry中有几个r?
+Generated: 在英文单词 "strawberry" 中，字母 "r" 出现了 **3 次**。
+- 位置：第 3 个字母（s**t r**awberry）、第 6 个字母（stra**w b**erry 中的 "r" 实际是第 6 个字符，但注意 "w" 后是 "b"，这里需要仔细数）
+实际上：
+- 分解：s-t-r-a-w-b-e-r-r-y
+- 字母 "r" 出现在第 3、第 8 和第 9 位（索引从 1 开始）。
+所以，**"strawberry" 包含 3 个 "r"**。
+--------------------------------------------------
+Prompt: There is a girl who likes adventure,
+Generated: Of course! Here are a few ways to imagine what that could look like, from a simple story to a character profile.
+### A Short Story Snippet
+The map was old, the edges frayed and the ink faded in places. Ella traced the route with her finger for the hundredth time, her heart beating a rhythm of pure excitement. It wasn't just a path to a hidden waterfall; it was a path to *discovery*.
+She packed her bag not with fancy clothes, but with a well-worn compass, a rope, a water bottle, and her trusted journal. The forest welcomed her with the smell of damp earth and pine. Every rustle in the undergrowth was a mystery, every unfamiliar bird call a secret she was determined to learn.
+As she reached the cliff face she needed to climb, a thrill, not fear, shot through her. She
+--------------------------------------------------
+Prompt: Please give a brief introduction of DeepSeek company.
+Generated: Of course. Here is a brief introduction to DeepSeek.
+**DeepSeek** is a leading Chinese AI research company focused on developing powerful artificial intelligence models, with a primary emphasis on large language models (LLMs) and multimodal systems.
+Here are the key points about the company:
+*   **Core Focus:** They are best known for their **DeepSeek-V2** and the more recent **DeepSeek-V3** models, which are highly capable LLMs that compete with other top-tier models like GPT-4. They specialize in both closed and open-source AI.
+*   **Open-Source Contribution:** DeepSeak has made significant contributions to the open-source community. They have released powerful models like **DeepSeek-Coder** (focused on code generation and programming tasks) and the weights for earlier versions of their LLMs, allowing developers and researchers worldwide
+--------------------------------------------------
+"""
+```
 ### Generate the model