[Add] `tokenizer_class` in config to make it usable by the `pipeline` API
#10 opened by ariG23498 (HF Staff)
Hey team!
Thanks for open-sourcing this model.
I have added `tokenizer_class` to the configuration so that the model is compatible with the `pipeline` API as well. With the change in place, you can use the model like so:
```python
from transformers.pipelines import pipeline
import torch

messages = [
    {"role": "user", "content": "Who are you?"},
]

pipe = pipeline(
    "text-generation",
    model="XiaomiMiMo/MiMo-7B-RL",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
print(pipe(messages))
```
bwshen-mi changed the pull request title from "Adding tokenizer class for the `pipeline` API to use the tokenizer" to "[Add] `tokenizer_class` in config to make it usable by the `pipeline` API".
bwshen-mi changed the pull request status to merged.
@ariG23498 Thank you for your efforts. I have a small question: in `tokenizer_config.json`, the value of `tokenizer_class` is the string `"Qwen2Tokenizer"`, but here in `config.json` it is a `list[str]`. Do the two files need to use the same value type for `tokenizer_class`?
I don't think that is necessary. But if you run into any issue, do let me know and I will investigate further.
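As a rough illustration of the type difference being discussed, here is a small sketch. The dict contents below are paraphrased from this thread, not read from the actual repository files, so the real values may differ:

```python
# Hypothetical excerpts of the two config files discussed above
# (illustrative only; the actual repository files may differ).
tokenizer_config = {"tokenizer_class": "Qwen2Tokenizer"}  # a plain string
model_config = {"tokenizer_class": ["Qwen2Tokenizer"]}    # a list of strings

# The two entries carry the same class name in different shapes.
print(type(tokenizer_config["tokenizer_class"]).__name__)  # str
print(type(model_config["tokenizer_class"]).__name__)      # list
```

Per the reply above, the mismatch does not appear to cause problems in practice, but normalizing both files to the same type would be the safer choice if an issue ever surfaces.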