Update - Tool Calling + Chat Template bug fixes
#20 · pinned · opened by danielhanchen
Just updated DeepSeek-R1-0528 GGUFs and BF16 safetensors (the big 671B model):

- Native tool calling is now supported. It uses https://github.com/sgl-project/sglang/pull/6765 and https://github.com/vllm-project/vllm/pull/18874, where DeepSeek-R1 gets 93.25% on the BFCL (Berkeley Function-Calling Leaderboard): https://gorilla.cs.berkeley.edu/leaderboard.html. Use it via `--jinja` in llama.cpp. Native transformers and vLLM should work as well.
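For context, with `--jinja` enabled you can send a standard OpenAI-style tool-calling request to llama.cpp's `llama-server` (`/v1/chat/completions`). A minimal sketch of such a request body, assuming a local server; the `get_weather` tool is a made-up example, not part of the model or llama.cpp:

```python
import json

def build_tool_request(prompt: str) -> dict:
    """Build an OpenAI-style chat-completion payload with one example tool.

    The tool schema (get_weather) is hypothetical - swap in your own.
    POST this to llama-server's /v1/chat/completions endpoint.
    """
    return {
        "model": "DeepSeek-R1-0528",
        "messages": [{"role": "user", "content": prompt}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "description": "Get the current weather for a city",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    }

payload = build_tool_request("What's the weather in Tokyo?")
print(json.dumps(payload, indent=2))
```

With the native template active, the model's tool calls come back in the response's `tool_calls` field rather than as raw text.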
I had to fix multiple issues in SGLang's and vLLM's PRs (dangling newlines etc.).
- Chat template bug fixes: `add_generation_prompt` now works. Previously `<|Assistant|>` was always auto-appended; now it's toggle-able. This fixes many issues and should streamline chat sessions.
- UTF-8 encoding of `tokenizer_config.json` is now fixed, so it now works on Windows.
- Ollama using extra memory is now fixed: I removed `num_ctx` and `num_predict`, so it falls back to Ollama's defaults. Those settings allocated extra KV cache VRAM, spiking VRAM usage. Please set your context length manually.
- [10th June 2025] Update: LM Studio now also works.
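The `add_generation_prompt` toggle above works like this - a minimal sketch using a toy template (NOT the real DeepSeek-R1 chat template, just the same mechanism that `tokenizer.apply_chat_template` exposes):

```python
def apply_toy_chat_template(messages, add_generation_prompt=False):
    """Toy stand-in for tokenizer.apply_chat_template.

    The <|Role|> formatting here is a simplified illustration, not the
    actual DeepSeek-R1 template. The bug was that the trailing
    <|Assistant|> was always appended; now it only appears when
    add_generation_prompt=True (i.e. when you want the model to reply).
    """
    out = "".join(f"<|{m['role']}|>{m['content']}" for m in messages)
    if add_generation_prompt:
        out += "<|Assistant|>"  # cue the model to start its turn
    return out

msgs = [{"role": "User", "content": "Hi"}]
print(apply_toy_chat_template(msgs, add_generation_prompt=True))
# -> <|User|>Hi<|Assistant|>
print(apply_toy_chat_template(msgs))
# -> <|User|>Hi
```

Leaving the prompt off matters when you append the model's own completed turn back into the history, otherwise you get a dangling `<|Assistant|>` mid-conversation.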
- Ollama works by using the TQ1_0 quant:
  `ollama run hf.co/unsloth/DeepSeek-R1-0528-GGUF:TQ1_0`
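Since `num_ctx` and `num_predict` were removed from the Modelfile, you can set them per-request instead. A sketch of a request body for Ollama's REST API (`POST http://localhost:11434/api/chat`); `num_ctx` and `num_predict` are real Ollama options, but the values below are just examples - pick what fits your VRAM:

```python
import json

def build_ollama_request(prompt: str, num_ctx: int = 8192) -> dict:
    """Build an Ollama /api/chat payload with an explicit context length.

    Larger num_ctx allocates more KV cache VRAM, which is exactly the
    spike the Modelfile change avoids by deferring to you.
    """
    return {
        "model": "hf.co/unsloth/DeepSeek-R1-0528-GGUF:TQ1_0",
        "messages": [{"role": "user", "content": prompt}],
        "options": {
            "num_ctx": num_ctx,    # context window size (example value)
            "num_predict": 2048,   # cap on generated tokens (example value)
        },
    }

print(json.dumps(build_ollama_request("Hello"), indent=2))
```

Alternatively, in an interactive `ollama run` session the same option can be set with `/set parameter num_ctx 8192`.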
Please re-download all weights to get the latest updates!
What is 3. about? I think I can ignore all the other ones and not re-download.
Why was UD-Q2_XL deleted? Is UD-IQ2_M better?
> What is 3. about? I think I can ignore all the other ones and not re-download.

It's not that important.
> Why was UD-Q2_XL deleted? Is UD-IQ2_M better?

Oh crap, you're right, it was never supposed to be deleted lol, thanks for the warning! I also noticed Q8_0 was gone!! I'll redo Q8_0 and Q2_K_XL.
Thank you!
Why is DeepSeek-R1-0528-UD-IQ2_M-00001-of-00005.gguf much newer than the rest of its parts? Are all the files updated (as mentioned above), or just the first one?