Ahmadzei's picture
update 1
57bdca5
raw
history blame
595 Bytes
This is what is sometimes called horizontal parallelism, as the splitting happens on horizontal level.
Learn more about Tensor Parallelism here.
token
A part of a sentence, usually a word, but can also be a subword (non-common words are often split in subwords) or a
punctuation symbol.
token Type IDs
Some models' purpose is to do classification on pairs of sentences or question answering.
These require two different sequences to be joined in a single "input_ids" entry, which usually is performed with the
help of special tokens, such as the classifier ([CLS]) and separator ([SEP]) tokens.