
Any plans for some tiny models? (<4b)

#2
by phly95 - opened

Mistral models are great, but the lineup is unfortunately missing anything under 4B. Options like Qwen 2.5, Gemma 2 2B, and Llama 3.2 3B and 1B exist, but I feel like having a Mistral model in that range would make deploying local LLM-powered apps a lot easier, especially when deploying to basic laptops in a workplace (good luck convincing IT to deploy Nvidia laptops to an entire company).
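For reference, this is roughly what CPU-only deployment looks like today with one of the alternatives named above; a minimal sketch assuming the `transformers` library, with the Qwen 2.5 1.5B model id and generation settings chosen purely as an illustration:

```python
# Sketch of running a sub-4B model on a CPU-only laptop with transformers.
# The model id and settings below are illustrative assumptions, not anything
# Mistral currently ships in this size range.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-1.5B-Instruct"  # small enough to fit in a basic laptop's RAM
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # loads on CPU by default, no GPU needed

messages = [{"role": "user", "content": "Summarize this ticket in one sentence: ..."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=64)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

A Mistral checkpoint at this scale could slot into the same few lines, which is the appeal: no GPU provisioning, just RAM and a pip install.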
