GGUF Model, Can I split it?
Based on NyxKrage's LLM VRAM calculator
Inputs
Max Allocated VRAM
Model (unquantized): Hugging Face model search; GGUF, AWQ, GPTQ, and EXL2 repos are excluded
Context Size
Context offloaded to (context location): VRAM or RAM
Quantization Size: Q4_K_S
Batch Size
Submit
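The two sizes the calculator works from can be roughly estimated from the model's metadata. Below is a minimal sketch, not the calculator's actual code: it assumes a standard transformer with an fp16 KV cache and uses about 4.5 bits per weight for Q4_K_S (the real figure varies per tensor and GGUF version). Names like `n_params`, `n_kv_heads`, and `head_dim` are illustrative placeholders.

```python
# Rough size estimates for a quantized GGUF model (illustrative only).

def model_size_gb(n_params: float, bits_per_weight: float = 4.5) -> float:
    """Approximate size of the quantized weights.

    ~4.5 bits per weight is a rough average for Q4_K_S; GGUF files mix
    quant types per tensor, so treat this as an estimate.
    """
    return n_params * bits_per_weight / 8 / 1e9


def kv_cache_gb(context_len: int, n_layers: int, n_kv_heads: int,
                head_dim: int, bytes_per_elem: int = 2) -> float:
    """Approximate KV-cache ("context") size for a given context length.

    Two tensors (K and V) per layer; fp16 (2 bytes per element) assumed.
    """
    return 2 * n_layers * context_len * n_kv_heads * head_dim * bytes_per_elem / 1e9


if __name__ == "__main__":
    # Example numbers loosely shaped like a 7B Llama-style model (assumed).
    print(f"weights : {model_size_gb(7e9):.2f} GB")
    print(f"context : {kv_cache_gb(8192, 32, 8, 128):.2f} GB")
```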
Results
Model Size (GB): 4.20
Context Size (GB): 6.90
Total Size (GB): 420.69
Layer Size (GB): 42.69
Layers offloaded to GPU (out of total): 42
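The headline number, layers offloaded to GPU, is essentially how many whole layers fit in whatever VRAM is left after the context (if it lives in VRAM) and some fixed overhead. A minimal sketch of that split follows; the function name, the 0.5 GB overhead, and the example figures are assumptions, not the calculator's exact formula.

```python
import math

def gpu_layers(max_vram_gb: float, model_gb: float, context_gb: float,
               total_layers: int, context_in_vram: bool = True,
               overhead_gb: float = 0.5) -> int:
    """How many whole layers fit on the GPU (rough sketch).

    Assumes layers are roughly equal in size and reserves a small,
    assumed overhead for scratch buffers.
    """
    layer_gb = model_gb / total_layers
    budget = max_vram_gb - overhead_gb - (context_gb if context_in_vram else 0.0)
    if budget <= 0:
        return 0
    return min(total_layers, math.floor(budget / layer_gb))


# Example: 24 GB card, 13 GB of weights, 2 GB of context, 40-layer model (assumed values).
print(gpu_layers(24, 13, 2, 40))  # 40 if everything fits, fewer otherwise
```

The resulting count is the kind of value you would pass to llama.cpp's `-ngl` / `--n-gpu-layers` option when loading the GGUF file; layers beyond that number stay in system RAM.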