Model's Resource Utilization

#230
by kalashshah19 - opened

Let's discuss resource utilization, a major concern for individual devs

How much VRAM does it use?

The model in this repo, without any quantization or customization, will need around 1.2 TB of VRAM (probably 16 NVIDIA H100 GPUs will work). But this depends on the amount of cache and the context size you configure, which may require more VRAM to accommodate the cache. Consider maybe 1.5 TB of VRAM for a comfortable scenario.
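As a rough sanity check on that figure, you can back out the VRAM from the parameter count and the weight dtype. The sketch below assumes ~600B parameters stored in bf16 (2 bytes each) and 80 GB of HBM per H100; those numbers are illustrative assumptions, not taken from this repo's config:

```python
import math

def weight_vram_tb(n_params: float, bytes_per_param: int = 2) -> float:
    """Weight memory in terabytes (1 TB = 1e12 bytes), weights only --
    KV cache and activations come on top of this."""
    return n_params * bytes_per_param / 1e12

params = 600e9                        # hypothetical parameter count
tb = weight_vram_tb(params)           # ~1.2 TB for the weights alone
gpus = math.ceil(tb * 1e12 / 80e9)    # H100s needed at 80 GB HBM each
print(f"{tb:.1f} TB -> at least {gpus} H100s")
```

The extra ~0.3 TB headroom in the "comfortable" estimate is what absorbs the KV cache, which grows with batch size and configured context length.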

Ohh thanks for the info mate :)

kalashshah19 changed discussion title from How much VRAM does it use ? to Model's Resource Utilization
