djstrong committed
Commit 334e05c · verified · Parent: 465f95c

Update README.md

Files changed (1):
  1. README.md (+2 -0)
README.md CHANGED
@@ -21,6 +21,8 @@ This model was obtained by quantizing the weights and activations of [Bielik-1.5
 AutoFP8 is used for quantization. This optimization reduces the number of bits per parameter from 16 to 8, reducing the disk size and GPU memory requirements by approximately 50%.
 Only the weights and activations of the linear operators within transformer blocks are quantized. Symmetric per-tensor quantization is applied, in which a single linear scale maps the FP8 representations of the quantized weights and activations back to their original range.
 
+📚 Technical report: https://arxiv.org/abs/2505.02550
+
 FP8 computation is supported on Nvidia GPUs with compute capability ≥ 8.9 (Ada Lovelace, Hopper).
 
 **DISCLAIMER: Be aware that quantized models show reduced response quality and possible hallucinations!**
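For context on the AutoFP8 step mentioned above: a quantization run with AutoFP8 typically follows the pattern sketched below, adapted from the AutoFP8 project's usage example. The model ids and calibration text here are placeholders, not the exact settings used to produce this checkpoint.

```python
from transformers import AutoTokenizer
from auto_fp8 import AutoFP8ForCausalLM, BaseQuantizeConfig

# Placeholder ids -- the exact source and output repos are not stated in this diff.
pretrained_model_dir = "<base-model-id>"
quantized_model_dir = "<fp8-output-dir>"

tokenizer = AutoTokenizer.from_pretrained(pretrained_model_dir)
# A few calibration samples are used to fix the static activation scales.
examples = tokenizer(["Example calibration text."], return_tensors="pt").to("cuda")

# Symmetric FP8 quantization of the linear operators, static activation scheme.
quantize_config = BaseQuantizeConfig(quant_method="fp8", activation_scheme="static")

model = AutoFP8ForCausalLM.from_pretrained(pretrained_model_dir, quantize_config)
model.quantize(examples)
model.save_quantized(quantized_model_dir)
```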
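The "symmetric per-tensor" scheme can be made concrete with a short PyTorch sketch (illustrative only; assumes PyTorch ≥ 2.1 for `torch.float8_e4m3fn`, whose largest finite value is 448): one scale per tensor, derived from its absolute maximum, maps values into FP8 and back.

```python
import torch

FP8_E4M3_MAX = 448.0  # largest finite value representable in float8_e4m3fn

def quantize_per_tensor_fp8(x: torch.Tensor):
    """Symmetric per-tensor quantization: a single scale for the whole tensor."""
    scale = x.abs().max().clamp(min=1e-12) / FP8_E4M3_MAX
    x_fp8 = (x / scale).clamp(-FP8_E4M3_MAX, FP8_E4M3_MAX).to(torch.float8_e4m3fn)
    return x_fp8, scale

def dequantize(x_fp8: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """The single linear scale maps FP8 values back to the original range."""
    return x_fp8.to(torch.float32) * scale

w = torch.randn(4096, 4096)
w_fp8, scale = quantize_per_tensor_fp8(w)
print("max abs error:", (w - dequantize(w_fp8, scale)).abs().max().item())
```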
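A quick way to check the ≥ 8.9 compute-capability requirement noted above, using the standard PyTorch API (the fallback behaviour mentioned in the comment depends on the serving runtime and is an assumption, not something stated in this diff):

```python
import torch

# (8, 9) corresponds to Ada Lovelace; Hopper is (9, 0).
major, minor = torch.cuda.get_device_capability()
if (major, minor) >= (8, 9):
    print(f"Native FP8 supported (compute capability {major}.{minor})")
else:
    # On older GPUs, runtimes such as vLLM may fall back to emulated kernels.
    print(f"Compute capability {major}.{minor} < 8.9: no native FP8 support")
```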