Tokenizer fails to load

#1
by mbanon - opened

Hi, I am trying to use this model. I tried the code you provided in the README, and I get the following error:

OSError: Can't load tokenizer for 'TurkuNLP/multilingual-web-register-classification'.
 If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name.
 Otherwise, make sure 'TurkuNLP/multilingual-web-register-classification' is the correct path to a directory containing all relevant files for a XLMRobertaTokenizerFast tokenizer.

The model seemed to load ok, but the tokenizer is failing.
Maybe some files are missing?
Thanks!

mbanon changed discussion title from Tokenizer failing to load to Tokenizer fails to load
TurkuNLP Research Group org

Hi, thanks for reporting this issue! The README had an error, which I've now fixed. The correct way to load the tokenizer is:

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-large")

(Rather than using the model name directly.)

Let me know if you encounter any other problems! Thanks!
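For reference, the corrected loading code can be sketched end to end as follows. This is a minimal sketch assuming the transformers library; the helper function name is made up for illustration and is not part of the model repo.

```python
# Sketch of the fix from this thread: the classifier weights live under
# the TurkuNLP repo, but the tokenizer must be loaded from the base
# "xlm-roberta-large" checkpoint instead of the model id.

MODEL_ID = "TurkuNLP/multilingual-web-register-classification"
TOKENIZER_ID = "xlm-roberta-large"  # base checkpoint, per the fixed README

def load_register_classifier():
    """Load the classifier weights and the base-model tokenizer.

    Hypothetical helper name, for illustration only.
    """
    # Imported lazily so the constants above can be inspected without
    # transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_ID)
    return model, tokenizer

if __name__ == "__main__":
    # Downloads the weights on first run.
    model, tokenizer = load_register_classifier()
    inputs = tokenizer("Hello world", return_tensors="pt")
    print(model(**inputs).logits.shape)
```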

Great, thanks!

mbanon changed discussion status to closed