Is flash-attention-2 supported?
#8 · opened by Jack7777777
I see it has been commented out now: https://huggingface.co/Alibaba-NLP/new-impl/blob/main/modeling.py#L592
Hi, xformers can dispatch to the flash-attention-2 kernel on its own, so I commented out that extra code path.
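For reference, a minimal sketch of what that dispatch looks like from the caller's side (assuming a recent xformers build with the FlashAttention backend compiled in and a supported GPU; the tensor shapes here are just illustrative):

```python
import torch
import xformers.ops as xops

# Toy tensors in the (batch, seq_len, num_heads, head_dim) layout xformers expects.
q = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)

# xformers picks the best available backend automatically; on supported GPUs
# this can be the FlashAttention-2 kernel.
out = xops.memory_efficient_attention(q, k, v)

# Optionally force the FlashAttention ops to confirm they are usable in your setup.
out_flash = xops.memory_efficient_attention(
    q, k, v, op=(xops.fmha.flash.FwOp, xops.fmha.flash.BwOp)
)
```

So explicitly enabling flash-attention-2 in the modeling code is not needed when the xformers path is used.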