- #29 "Fixed typo" opened about 2 hours ago by SpiridonSunRotator
- #28 "Update README.md" opened about 5 hours ago by yonigozlan
- #27 "Quantized models with vision included?" opened about 14 hours ago by geoad (2 comments)
- #26 "Corrected vllm link in readme" opened about 15 hours ago by riversnow
- #25 "Regarding Video Understanding" opened about 15 hours ago by fensz (1 comment)
- #24 "Support tool calls with chat template" opened about 20 hours ago by CISCai
- #23 "FIX for the pip install vllm --ugrade --> pip install vllm --upgrade" opened about 20 hours ago by rbgo
- #22 "How do we use it with Transformers? can you give some sample code ?" opened about 21 hours ago by rameshch (4 comments)
- #21 "Local Installation Video and Testing on Vision, Coding, Math, Text - Step by Step" opened 1 day ago by fahdmirzac
- #20 "Visual Grounding" opened 1 day ago by Maverick17
- #19 "Mistral-small" opened 1 day ago by Melkiss
- #18 "Add chat template to tokenizer config" opened 1 day ago by mrfakename
- #16 "Mistral3ForConditionalGeneration has no vLLM implementation and the Transformers implementation is not compatible with vLLM. Try setting VLLM_USE_V1=0." opened 1 day ago by pedrojfb99 (1 comment)
- #15 "set model_max_length to the maximum length of model context (131072 tokens)" opened 1 day ago by x0wllaar
- #14 "Problem with `mistral3` when loading the model" opened 1 day ago by r3lativo (7 comments)
- #11 "Add chat_template to tokenizer_config.json" opened 1 day ago by bethrezen (1 comment)
- #7 "Can't wait for HF? try chatllm.cpp" opened 1 day ago by J22 (6 comments)
- #4 "You did it again..." opened 2 days ago by MrDevolver
- #2 "HF Format?" opened 2 days ago by bartowski (41 comments)