They are token indices: numerical representations of the tokens that make up the sequences the model receives as input. Each tokenizer works differently, but the underlying mechanism remains the same.
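
A minimal sketch of this idea, assuming the Hugging Face `transformers` library and the `bert-base-uncased` checkpoint (neither is named in the text above, they are only illustrative):

```python
from transformers import AutoTokenizer

# Load a tokenizer; any checkpoint works, "bert-base-uncased" is an example.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

sequence = "Hello, how are you?"

# The tokenizer splits the text into tokens and maps each token to its index
# in the model's vocabulary; these indices are the input IDs.
encoded = tokenizer(sequence)
print(encoded["input_ids"])

# Converting the IDs back to tokens shows how this tokenizer segmented the text.
print(tokenizer.convert_ids_to_tokens(encoded["input_ids"]))
```

A different tokenizer (for example a byte-pair-encoding or SentencePiece one) would segment the same string differently and use a different vocabulary, but it would still produce a list of integer indices in the end.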