Managed to get my hands on a 5090FE, and it's beefy:
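| model | size | params | backend | ngl | test | t/s |
| --- | ---: | ---: | --- | ---: | ---: | ---: |
| llama 8B Q8_0 | 7.95 GiB | 8.03 B | CUDA | 99 | pp512 | 12207.44 ± 481.67 |
| llama 8B Q8_0 | 7.95 GiB | 8.03 B | CUDA | 99 | tg128 | 143.18 ± 0.18 |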
Comparison with other GPUs:
http://devquasar.com/gpu-gguf-inference-comparison/