Error: Sliding Window Attention is enabled but not implemented for `sdpa`; unexpected results may be encountered

#27
by fffutr30 - opened

When loading deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B in Python, I get the warning "Sliding Window Attention is enabled but not implemented for `sdpa`; unexpected results may be encountered".
How can I deal with it?

I solved it by installing Flash Attention:

pip install flash-attn --no-build-isolation

and selecting it when initializing the model:

import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    attn_implementation="flash_attention_2",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

I'm not sure whether this causes any unexpected behavior, but it works, and the warning is gone.
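Since `flash_attention_2` requires the flash-attn package and a supported GPU, a small helper can pick the backend at runtime and fall back to `"eager"` (the plain PyTorch attention implementation, which does handle sliding windows) when flash-attn is not installed. This is just a sketch; `pick_attn_implementation` is a hypothetical helper, not part of transformers:

```python
import importlib.util


def pick_attn_implementation() -> str:
    # Use flash-attn if the package is importable; it also needs a
    # compatible CUDA GPU, which this simple check does not verify.
    if importlib.util.find_spec("flash_attn") is not None:
        return "flash_attention_2"
    # Fall back to the eager PyTorch implementation, which supports
    # sliding-window attention and avoids the sdpa warning.
    return "eager"
```

You would then pass `attn_implementation=pick_attn_implementation()` to `from_pretrained` so the same script runs on machines with or without flash-attn.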
