How can I run it with 400MB of memory, as claimed in the model card?
What quantisation changes do I need to make for that to happen?
Build bitnet.cpp.
I have done that. But don't I need a smaller model file?