Request for tokenizer.model File for DeepSeek-qwen-Bllossom-32B

#1
by WOOYT99 - opened

Hello,

I’m currently working with DeepSeek-qwen-Bllossom-32B, which appears to use the Qwen tokenizer architecture. However, the tokenizer.model file is missing from the current distribution, and I’m getting errors when I try to load the tokenizer with use_fast=False.

Could you please provide the corresponding tokenizer.model file or let me know the best source from which I can obtain a compatible version?

This is critical for enabling SentencePiece-style slow tokenization in a local environment where fast tokenizers are not available.
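For reference, this is roughly the call that fails; a minimal sketch assuming the standard transformers API (the repo id below is illustrative, not the exact hub path):

```python
from transformers import AutoTokenizer

# Load the slow (pure-Python) tokenizer; this raises an error while the
# tokenizer's vocabulary files are missing from the repository.
# NOTE: the repo id is illustrative, not the exact hub path.
tokenizer = AutoTokenizer.from_pretrained(
    "UNIVA-Bllossom/DeepSeek-qwen-Bllossom-32B",
    use_fast=False,
)
```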

Thank you very much in advance!

UNIVA and KAIST-MLP lab org

Hello,

I’ve now added the merges.txt and vocab.json files to the DeepSeek-qwen-Bllossom-32B release, so you can load the tokenizer with use_fast=False without any errors. Note that Qwen-style tokenizers are byte-level BPE, so the slow tokenizer loads from vocab.json and merges.txt rather than from a SentencePiece tokenizer.model file.
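As a quick sanity check, something like the following should now work; a minimal sketch assuming the standard transformers API (the repo id is again illustrative):

```python
from transformers import AutoTokenizer

# With vocab.json and merges.txt present in the repo, the slow BPE
# tokenizer can be loaded without the fast (Rust) backend.
# NOTE: the repo id is illustrative, not the exact hub path.
tokenizer = AutoTokenizer.from_pretrained(
    "UNIVA-Bllossom/DeepSeek-qwen-Bllossom-32B",
    use_fast=False,
)

# Simple round trip to confirm the tokenizer works end to end.
ids = tokenizer.encode("Hello, world!")
print(tokenizer.decode(ids))
```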

Please let me know if you run into any further issues!

Thank you.
