How much CUDA memory to use
#1
by robindon - opened

How much CUDA memory is used on the corresponding NVIDIA card?
Hello, if you load the 78B model in bf16, the model weights alone will consume approximately 78B parameters * 2 bytes = 156 GB.
On the other hand, if you use lmdeploy to load the AWQ-quantized (4-bit) version of the model, the weights will consume approximately 78B parameters * 0.5 bytes = 39 GB.
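The arithmetic above can be sketched as a small helper (the function name and the bytes-per-parameter table are illustrative, not part of lmdeploy):

```python
# Approximate GPU memory consumed by model weights alone.
# This excludes activations and the KV cache, which require
# additional memory at inference time.
BYTES_PER_PARAM = {
    "bf16": 2.0,  # 16-bit brain float
    "fp16": 2.0,  # 16-bit half precision
    "awq": 0.5,   # 4-bit AWQ weight quantization
}

def weight_memory_gb(num_params_billions: float, dtype: str) -> float:
    """Estimated weight memory in GB for a model of the given size."""
    return num_params_billions * BYTES_PER_PARAM[dtype]

print(weight_memory_gb(78, "bf16"))  # 156.0
print(weight_memory_gb(78, "awq"))   # 39.0
```

Note that this is a lower bound: serving the model also needs room for activations and the KV cache, so the total CUDA memory required will be somewhat higher than the weight footprint alone.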
czczup changed discussion status to closed