Download the llamafile
- Open the file page at https://huggingface.co/avilum/llamafile-python-openai-template/blob/main/TinyLlama-1.1B.llamafile
- Click the download button on that page to save TinyLlama-1.1B.llamafile locally.
Run the server
chmod +x TinyLlama-1.1B.llamafile
./TinyLlama-1.1B.llamafile --server --host 0.0.0.0 --port 1234
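The server may need a moment to load the model before it accepts requests. A minimal sketch of a readiness check, using only the Python standard library, is shown below. The helper name `wait_for_port` is hypothetical (it is not part of llamafile or the OpenAI SDK), and the check only confirms that the TCP port is open, not that the model has finished loading.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper for illustration; it checks reachability only.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

With the command above, `wait_for_port("127.0.0.1", 1234)` should return `True` once the server is listening, assuming the port was left at 1234.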
Use the LLM with the OpenAI SDK:
from openai import OpenAI
client = OpenAI(base_url="http://127.0.0.1:1234/v1", api_key="test")
# Prompt
prompt = "Hi, tell me something new about AppSec"
# Send API request to llamafile server
stream = client.chat.completions.create(
    model="avi-llmsky",
    messages=[{"role": "user", "content": prompt}],
    stream=True,
)
# Print the responses
for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
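If the full reply is needed as one string rather than printed piece by piece, the streamed chunks can be accumulated. The sketch below assumes each chunk exposes `choices[0].delta.content` as in the loop above. The helper `collect_stream` and the `fake_chunk` stand-in are hypothetical names used for illustration, and skipping chunks with an empty `choices` list is a defensive assumption, not documented llamafile behavior.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def collect_stream(stream: Iterable[Any]) -> str:
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in stream:
        # Defensive assumption: ignore chunks that carry no choices.
        if not chunk.choices:
            continue
        content = chunk.choices[0].delta.content
        if content is not None:
            parts.append(content)
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in object shaped like a streamed chunk (demo only)."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# Demo with stand-in chunks, so no running server is required.
demo = [fake_chunk("Hello"), fake_chunk(None), fake_chunk(", world")]
print(collect_stream(demo))  # prints "Hello, world"
```

Against a running server, the same helper could be called as `collect_stream(stream)` in place of the printing loop.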