Inference Providers documentation
Cerebras
Cerebras stands alone as the world’s fastest AI inference and training platform. Organizations across fields like medical research, cryptography, energy, and agentic AI use our CS-2 and CS-3 systems to build on-premise supercomputers, while developers and enterprises everywhere can access the power of Cerebras through our pay-as-you-go cloud offerings.
Supported tasks
Chat Completion (LLM)
Find out more about the Chat Completion (LLM) task in the Hugging Face task documentation.
from huggingface_hub import InferenceClient

# Route the request through the Cerebras provider using a
# Hugging Face access token (replace the placeholder with your own).
client = InferenceClient(
    provider="cerebras",
    api_key="hf_xxxxxxxxxxxxxxxxxxxxxxxx",
)

completion = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ],
    max_tokens=500,
)

# Print the assistant's reply message
print(completion.choices[0].message)