Uh, not sure if I'm trying to use this wrong, but it seems to use so much RAM that my system just freezes

#1
by question2005 - opened

So I downloaded this to try and write stories, and upon loading the model in LM Studio, it starts using an absurd amount of RAM and doesn't stop. LM Studio shows it using 14+ GB of RAM. I have 32 GB of RAM, but my system starts freezing when too much is used.

Am I doing something wrong? I can't even enter a prompt because merely loading the model causes my RAM usage to spike to crazy levels.

Owner

This model uses Llama 2, a much older architecture, and it needs a lot of RAM/VRAM for the weights plus a lot more RAM/VRAM for context.
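To give a rough sense of why context is expensive on the older architecture, here is a minimal sketch that estimates the key/value cache size for a given context length. The formula is the standard one (one key and one value vector per layer, per KV head, per token). The layer and head counts below are the published shapes for Llama 2 13B and Llama 3 8B and are used only as an illustration; the thread does not state this model's size, so treat them as assumptions. It also assumes a 16-bit cache and ignores the weights themselves and any runtime overhead.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_value=2):
    """Approximate KV-cache size in bytes.

    The leading 2 accounts for storing both a key and a value vector
    for every layer, every KV head, and every token in the context.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_value


GIB = 1024 ** 3

# Assumed shapes (illustrative, not taken from this thread):
# Llama 2 13B has no grouped-query attention, so every head keeps its own K/V.
llama2_13b = kv_cache_bytes(n_layers=40, n_kv_heads=40, head_dim=128, context_len=4096)
# Llama 3 8B uses grouped-query attention with 8 KV heads.
llama3_8b = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, context_len=4096)

print(f"Llama 2 13B style, 4096 tokens: {llama2_13b / GIB:.3f} GiB")
print(f"Llama 3 8B style, 4096 tokens: {llama3_8b / GIB:.3f} GiB")
```

Under these assumptions the older layout needs several times more cache memory at the same context length, which is on top of the memory for the weights. Lowering the context length in the loader settings shrinks this part proportionally.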

If you are looking for CPU-based models (that is, models you can run on CPU at good speed), you need 1B to 14B at the most: Llama 3, 3.1, Qwens, Gemmas, etc.

Dark Planet 8B is a good one.

You can also use MoE models, as only a fraction of the model is active for each token.

Dark Champion is a Llama 3.2 MoE I built:
https://huggingface.co/DavidAU/Llama-3.2-8X3B-MOE-Dark-Champion-Instruct-uncensored-abliterated-18.4B-GGUF

See also Qwen 30B-A3B: it is very fast, even on CPU, with a low memory footprint.

You can also visit the main repo page:
https://huggingface.co/DavidAU/

Models are categorized there, and you can search the entire repo there too.
