No Triton for Windows

by fernandomir - opened May 23, 2024

May 23, 2024

Any workaround? Also, how much VRAM is needed to run the microsoft/Phi-3-small-8k-instruct model?
Thanks!

mk1024

May 29, 2024

Has a solution been found yet?

Jul 4, 2024

Avoid using this model; it's extremely slow.

Almost the slowest 7B model I have ever seen.

I tested it on the same A100 against Qwen2-7B, with no quantization, as a straight comparison using raw transformers.

It's just extremely slow...

nguyenbh changed discussion status to closed Aug 30, 2024
