Better & Faster Large Language Models via Multi-token Prediction • Paper • 2404.19737 • Published Apr 30, 2024
Saiga GGUF • Collection • Russian fine-tunes of different base LLMs in the GGUF format, compatible with llama.cpp • 8 items • Updated Apr 27