meituan/DeepSeek-R1-Block-INT8

yuanzu committed on Feb 24

Commit

277034a

verified ·

1 Parent(s): 0d7d596

Update README.md (#1)

Files changed (1)

README.md CHANGED

@@ -46,6 +46,10 @@ library_name: transformers
   <a href="https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf"><b>Paper Link</b>👁️</a>
 </p>
+## 0. INT8 Quantization
+We apply INT8 quantization to the BF16 checkpoints, where each weight scale is determined by dividing the block-wise maximum of element values by the INT8 type maximum.
+The quantization script is provided in inference/bf16_case_int8.py.
 ## 1. Introduction
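
For readers of this diff, the block-wise scaling rule described in the added README section can be illustrated with a minimal sketch. This is not the repository's quantization script. It is a plain-Python illustration under stated assumptions: the "maximum of element values" is taken as the maximum absolute value in each block, the INT8 type maximum is 127, quantization is symmetric with round-to-nearest, and the block size (128 by default here) is a hypothetical choice, since the README text does not state it. The function names are likewise made up for illustration.

```python
INT8_MAX = 127  # assumed "INT8 type maximum"


def quantize_block_int8(weight, block=128):
    """Quantize a 2D list of floats block by block.

    Returns (q, scales): q holds integers in [-127, 127] with the same shape
    as weight; scales holds one float per (block x block) tile.
    The block size is an assumption, not taken from the README.
    """
    rows, cols = len(weight), len(weight[0])
    q = [[0] * cols for _ in range(rows)]
    scales = []
    for r0 in range(0, rows, block):
        r1 = min(r0 + block, rows)
        scale_row = []
        for c0 in range(0, cols, block):
            c1 = min(c0 + block, cols)
            # Block-wise maximum (assumed to mean maximum absolute value).
            amax = max(abs(weight[r][c])
                       for r in range(r0, r1) for c in range(c0, c1))
            # Scale = block maximum / INT8 maximum; guard all-zero blocks.
            scale = amax / INT8_MAX if amax > 0 else 1.0
            for r in range(r0, r1):
                for c in range(c0, c1):
                    v = round(weight[r][c] / scale)
                    q[r][c] = max(-INT8_MAX, min(INT8_MAX, v))
            scale_row.append(scale)
        scales.append(scale_row)
    return q, scales


def dequantize_block_int8(q, scales, block=128):
    """Recover approximate floats by multiplying each value by its block scale."""
    return [[q[r][c] * scales[r // block][c // block]
             for c in range(len(q[0]))]
            for r in range(len(q))]


if __name__ == "__main__":
    w = [[1.0, -2.0, 0.5, 0.25],
         [0.0, 4.0, -0.5, 1.0]]
    q, s = quantize_block_int8(w, block=2)
    print(q)
    print(s)
    print(dequantize_block_int8(q, s, block=2))
```

Because each block gets its own scale, a single large weight only reduces precision within its own tile rather than across the whole matrix. A real script would operate on tensors rather than nested lists.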