No Triton for Windows
#4 by fernandomir - opened
Any workaround? Also, how much VRAM is needed to run the microsoft/Phi-3-small-8k-instruct model?
Thanks!
Has a solution been found yet?
Avoid using this model, it's extremely slow.
Almost the slowest 7B model I have ever seen.
I tested it on the same A100 against Qwen2-7B, with no quantization, just a plain comparison using raw transformers.
It's just extremely slow...
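For anyone who wants to reproduce this kind of side-by-side comparison, here is a minimal sketch of a throughput helper. It is not the script used for the numbers above. The name `tokens_per_second` and the `generate` callable are hypothetical: `generate` is assumed to run one full generation, block until it finishes, and return the number of new tokens produced.

```python
import time
from typing import Callable


def tokens_per_second(generate: Callable[[], int], warmup: int = 1, runs: int = 3) -> float:
    """Return generated tokens per second, aggregated over `runs` timed calls.

    `generate` is a hypothetical zero-argument callable (an assumption, not
    part of any library) that performs one generation and returns the number
    of new tokens it produced.
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")

    # Untimed warm-up calls so one-off setup costs do not skew the result.
    for _ in range(warmup):
        generate()

    total_tokens = 0
    total_seconds = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        total_tokens += generate()
        total_seconds += time.perf_counter() - start

    if total_seconds == 0.0:
        return float("inf")
    return total_tokens / total_seconds
```

To compare two models fairly, wrap each one in its own `generate` callable with the same prompt, the same maximum number of new tokens, and the same dtype, then call `tokens_per_second` on each on the same GPU.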
nguyenbh changed discussion status to closed