Learn how to run Grok 2 correctly - Read our Guide.
Unsloth Dynamic 2.0 achieves superior accuracy & outperforms other leading quants.
## Grok 2 Usage Guidelines

- Use `--jinja` for `llama.cpp`. You must use PR 15539. For example, use the commands below:

  ```bash
  git clone https://github.com/ggml-org/llama.cpp
  cd llama.cpp && git fetch origin pull/15539/head:MASTER && git checkout MASTER && cd ..
  ```
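Once that branch is checked out and built, a GGUF from this repo could be run with `llama-cli`; a minimal sketch, assuming a CMake build and a hypothetical quant filename (substitute whichever quant file you actually downloaded):

```bash
# Build llama.cpp from the checkout above (add -DGGML_CUDA=ON for GPU offload).
cmake llama.cpp -B llama.cpp/build
cmake --build llama.cpp/build --config Release -j

# --jinja applies the model's bundled chat template; the .gguf name here is a placeholder.
./llama.cpp/build/bin/llama-cli \
    --model grok-2-Q4_K_M.gguf \
    --jinja \
    --prompt "What is your name?"
```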
- Uses Alvaro's Grok-2 HF-compatible tokenizer, as provided here.
## Grok 2

This repository contains the weights of Grok 2, a model trained and used at xAI in 2024.
## Usage: Serving with SGLang
1. Download the weights. You can replace `/local/grok-2` with any other folder name you prefer:

   ```bash
   hf download xai-org/grok-2 --local-dir /local/grok-2
   ```

   You might encounter some errors during the download. Please retry until the download is successful.
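A partial download is easy to miss. As a quick sanity check, a minimal sketch (assuming Python 3 and the download path used above) that counts the files and totals their size:

```python
from pathlib import Path

def summarize_dir(path):
    """Count regular files directly inside `path` and sum their sizes in bytes."""
    files = [p for p in Path(path).iterdir() if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    return len(files), total_bytes

target = Path("/local/grok-2")  # adjust if you chose a different folder
if target.is_dir():
    count, size = summarize_dir(target)
    print(f"{count} files, {size / 1e9:.1f} GB")
```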
   If the download succeeds, the folder should contain 42 files and be approximately 500 GB.

2. Launch a server.
   Install the latest SGLang inference engine (>= v0.5.1) from https://github.com/sgl-project/sglang/, then use the command below to launch an inference server. This checkpoint is TP=8, so you will need 8 GPUs (each with > 40 GB of memory):

   ```bash
   python3 -m sglang.launch_server --model /local/grok-2 --tokenizer-path /local/grok-2/tokenizer.tok.json --tp 8 --quantization fp8 --attention-backend triton
   ```
3. Send a request.

   This is a post-trained model, so please use the correct chat template:

   ```bash
   python3 -m sglang.test.send_one --prompt "Human: What is your name?<|separator|>\n\nAssistant:"
   ```

   You should be able to see the model output its name, Grok. Learn more about other ways to send requests here.
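Besides `sglang.test.send_one`, a running SGLang server also accepts HTTP requests. Below is a hedged sketch of building a payload for SGLang's native `/generate` endpoint; the port and sampling parameters are assumptions, so check your server's startup log:

```python
import json

# Grok 2 chat template, matching the send_one example above.
prompt = "Human: What is your name?<|separator|>\n\nAssistant:"

def build_generate_payload(prompt, max_new_tokens=128, temperature=0.6):
    """Build a JSON payload for SGLang's native /generate endpoint."""
    return {
        "text": prompt,
        "sampling_params": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
        },
    }

payload = build_generate_payload(prompt)
print(json.dumps(payload, indent=2))

# With a server running (assumed default http://localhost:30000), send it with:
#   import requests
#   out = requests.post("http://localhost:30000/generate", json=payload).json()
#   print(out["text"])
```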
## License

The weights are licensed under the Grok 2 Community License Agreement.
## Model tree for unsloth/grok-2-GGUF

Base model: xai-org/grok-2