make xformers an optional dependency

#6

This adapts the LlamaMLP from the llama modeling code in transformers to split the fused w12 weight during the forward pass, and uses it when xformers is not available on the system.

This enables the model to be used on macOS, for example.
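
For illustration, here is a minimal sketch of what such a fallback could look like, assuming xformers' fused SwiGLU layout where w12 packs the gate and up projections into one linear. The class name `LlamaMLPFallback`, the gate-first split order, and the `_HAS_XFORMERS` flag are illustrative, not the PR's actual code:

```python
import torch
import torch.nn.functional as F
from torch import nn

# xformers becomes optional: probe for it at import time.
try:
    import xformers.ops  # noqa: F401
    _HAS_XFORMERS = True
except ImportError:
    _HAS_XFORMERS = False


class LlamaMLPFallback(nn.Module):
    """Plain-PyTorch SwiGLU MLP used when xformers is unavailable.

    ``w12`` is the fused linear packing the gate and up projections
    (xformers' fused SwiGLU layout); it is split during the forward
    pass so the same checkpoint weights work on both code paths.
    """

    def __init__(self, hidden_size: int, intermediate_size: int):
        super().__init__()
        # Fused gate+up projection, same parameter layout as the fused path.
        self.w12 = nn.Linear(hidden_size, 2 * intermediate_size, bias=False)
        # Down projection back to the hidden size.
        self.w3 = nn.Linear(intermediate_size, hidden_size, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Split the fused output into its gate and up halves.
        gate, up = self.w12(x).chunk(2, dim=-1)
        # SwiGLU: silu(gate) * up, then project back down.
        return self.w3(F.silu(gate) * up)
```

The code constructing the model can then check `_HAS_XFORMERS` and instantiate either the fused xformers implementation or this fallback, keeping the state dict layout identical in both cases.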

Isn't working

You're right, I just realized that I accidentally broke it while cleaning up the code; I had been using an older local copy until now that was still working properly.
