How much CUDA memory is needed?

#1
by robindon - opened

How much CUDA memory is used on the corresponding NVIDIA card?

OpenGVLab org

Hello, if you load the 78B model in bf16, the model weights alone will consume approximately 78B parameters × 2 bytes ≈ 156 GB.

On the other hand, if you use lmdeploy to load the AWQ-quantized (4-bit) version of the model, the weights will consume approximately 78B parameters × 0.5 bytes ≈ 39 GB.
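
As a back-of-the-envelope check, here is a minimal sketch of that weights-only estimate (parameter count × bytes per parameter). The helper name `weight_memory_gb` is illustrative, not from lmdeploy, and the result ignores activations, the KV cache, and framework overhead:

```python
def weight_memory_gb(num_params_billions: float, bytes_per_param: float) -> float:
    """Estimate GPU memory for model weights alone, in GB (1 GB = 1e9 bytes).

    num_params_billions: parameter count in billions (e.g. 78 for the 78B model).
    bytes_per_param: 2 for bf16/fp16, 0.5 for 4-bit AWQ.
    Note: excludes activations, KV cache, and framework overhead.
    """
    return num_params_billions * bytes_per_param

print(weight_memory_gb(78, 2))    # bf16      -> 156.0 GB
print(weight_memory_gb(78, 0.5))  # AWQ 4-bit -> 39.0 GB
```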

czczup changed discussion status to closed