Help with clarifying something, please: can I run this on a single 4090?

#3
by Restrected - opened

I have two 4090s in an EPYC 32-core server, but I see a lot of smaller models actually perform impressively on very little hardware (relatively), like my M3 Max MacBook Pro. It runs 7B models beautifully.

I am trying to install a few LLMs running concurrently, so I'm hosting them on the server with a webpage for access, but I need to figure out how to run bigger models like this on a single card. Can anyone point me somewhere?
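For context on whether a given model fits on one 24 GB 4090, a rough back-of-the-envelope check is the weight footprint alone (ignoring KV cache and activations, which add more). This is a minimal sketch, not tied to any particular framework; the parameter counts and bit widths below are illustrative assumptions:

```python
def weight_vram_gb(params_billion: float, bits_per_param: float) -> float:
    """Rough VRAM needed just for the model weights, in GiB.

    Ignores KV cache, activations, and framework overhead, so the
    real requirement is somewhat higher.
    """
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1024**3

# Illustrative sizes: fp16 vs. 4-bit quantized weights.
for params, bits, label in [
    (7, 16, "7B fp16"),
    (13, 16, "13B fp16"),
    (34, 4, "34B 4-bit"),
    (70, 4, "70B 4-bit"),
]:
    print(f"{label}: ~{weight_vram_gb(params, bits):.1f} GiB")
```

By this estimate a 7B model in fp16 is ~13 GiB (fits on a 4090), while a 70B model even at 4-bit is ~33 GiB (does not fit on a single 24 GB card), which is why quantization and model size both matter for single-GPU setups.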
