In such models, passing the labels is the preferred way to handle training.
Please check each model's docs to see how it handles these input IDs for sequence-to-sequence training.
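As a rough illustration of why passing the labels is enough, some encoder-decoder models derive the decoder input IDs from the labels by shifting them one position to the right and prepending a start token. The sketch below is plain Python and does not call any library; the function name `shift_right`, the token IDs, and the start and padding IDs of 0 are all assumptions made for illustration, and the real values depend on the model.

```python
# Label value that loss functions conventionally skip (assumed here to be -100).
IGNORE_INDEX = -100


def shift_right(labels, decoder_start_token_id, pad_token_id):
    """Build decoder input IDs from labels by shifting one position right.

    Hypothetical helper: real models implement this internally and may differ.
    """
    # Prepend the start token and drop the last label so the length is unchanged.
    shifted = [decoder_start_token_id] + labels[:-1]
    # Ignored label positions cannot be fed to the decoder, so use padding instead.
    return [pad_token_id if tok == IGNORE_INDEX else tok for tok in shifted]


# Made-up label IDs; the trailing -100 entries mark padded positions.
labels = [37, 1712, 19, 1, -100, -100]
decoder_input_ids = shift_right(labels, decoder_start_token_id=0, pad_token_id=0)
print(decoder_input_ids)  # [0, 37, 1712, 19, 1, 0]
```

At each position the decoder then sees the previous target token and is trained to produce the current one, which is why the two sequences are offset by one.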
decoder models
Also referred to as autoregressive models, decoder models involve a pretraining task (called causal language modeling) where the model reads the text in order and has to predict the next word.
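The next-word objective can be pictured as a set of (context, target) pairs taken from a single sequence. The following is a minimal sketch in plain Python, not a training loop; the helper name `next_token_pairs` is hypothetical, and whole words stand in for the tokens a real tokenizer would produce.

```python
def next_token_pairs(tokens):
    """Return (context, target) pairs for left-to-right next-token prediction.

    Hypothetical helper for illustration: the context is everything read so far,
    the target is the token that comes immediately after it.
    """
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


# Words are used as stand-ins for tokens to keep the example readable.
tokens = ["The", "cat", "sat", "down"]
pairs = next_token_pairs(tokens)
for context, target in pairs:
    print(context, "->", target)
```

Each context contains only earlier tokens, never later ones, which is what makes the task causal.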