Here's an example using the BERT tokenizer, which is a WordPiece tokenizer:
```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("google-bert/bert-base-cased")
sequence = "A Titan RTX has 24GB of VRAM"
```
The tokenizer takes care of splitting the sequence into tokens that exist in its vocabulary.
```python
tokenized_sequence = tokenizer.tokenize(sequence)
```
The tokens are either words or subwords.
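To see the split, you can print the result. As a sketch (the exact pieces depend on the checkpoint's vocabulary), a word that is absent from the vocabulary, such as "VRAM" here, gets broken into subword pieces, while WordPiece marks each continuation piece with a `##` prefix:

```python
print(tokenized_sequence)
# Words present in the vocabulary (e.g. "Titan") stay whole; out-of-vocabulary
# strings are split into subword pieces, e.g. "VRAM" -> "V", "##RA", "##M".
# The "##" prefix means the piece attaches to the preceding token.
```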