Run with langchain
Can this model be used with langchain's LlamaCpp? If so, would you be kind enough to provide code? Thanks
Yeah - install llama-cpp-python, then here is a quick example:
from llama_cpp import Llama
import random

# Load the GGML model; n_gpu_layers offloads layers to the GPU
# (only has an effect if llama-cpp-python was built with GPU support)
llm = Llama(model_path="/path/to/stable-vicuna-13B.ggmlv3.q5_1.bin", n_gpu_layers=40, seed=random.randint(1, 2**31))

# Tokenize the prompt (llama-cpp-python works on bytes)
tokens = llm.tokenize(b"### Human: Write a story about llamas\n### Assistant:")

output = b""
count = 0
for token in llm.generate(tokens, top_k=40, top_p=0.95, temp=0.72, repeat_penalty=1.1):
    text = llm.detokenize([token])
    print(text.decode(), end='', flush=True)
    output += text
    count += 1
    # Stop after 500 tokens or at the end-of-sequence token
    if count >= 500 or token == llm.token_eos():
        break

print("Full response:", output.decode())
Thanks for the code, but I'm getting an assertion error. Using llama-cpp-python == 0.1.52 and the ggmlv3.q5_1 bin file:
assert self.ctx is not None
AssertionError
Would you know if this bin file is compatible with that package version? Thank you for your help.
With langchain, this https://github.com/marella/ctransformers could also be used. I had issues with llama-cpp-python (it was asking for Visual Studio), but ctransformers helped since it ships precompiled libraries. (I haven't tried this model with it, though.)
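For completeness, basic ctransformers usage looks roughly like this. Again a sketch, untested with this model as noted above; the path is illustrative and model_type="llama" is an assumption based on stable-vicuna being a LLaMA-family model:

from ctransformers import AutoModelForCausalLM

# Load the GGML file directly; no compilation needed since
# ctransformers ships prebuilt binaries
llm = AutoModelForCausalLM.from_pretrained(
    "/path/to/stable-vicuna-13B.ggmlv3.q5_1.bin",
    model_type="llama",
)

print(llm("### Human: Write a story about llamas\n### Assistant:", max_new_tokens=500))

If I remember right, it also ships a langchain wrapper (from ctransformers.langchain import CTransformers), so it can slot into chains the same way as LlamaCpp.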
I had that same issue, and had to use the ggmlv2 version. I think you have to build the newer llama.cpp for the ggmlv3, but I could be wrong.
llama-cpp-python was updated to support GGMLv3 about 10 hours ago, with version 0.1.53.
You can install llama-cpp-python 0.1.53 on Windows without compiling (note that wheel is for Python 3.10, per the cp310 tag) with: pip install https://github.com/abetlen/llama-cpp-python/releases/download/v0.1.53/llama_cpp_python-0.1.53-cp310-cp310-win_amd64.whl
Or yes, use ctransformers, which can be installed with pip install ctransformers
Thanks guys