Processing is done in parallel, and all setups are synchronized at the end of each training step.
Learn more about how DataParallel works here.
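As a rough illustration of the synchronization step, the toy sketch below simulates data parallelism in plain Python, with no GPUs or frameworks involved. It assumes that "synchronized" means averaging the per-replica gradients before the weight update, which is one common scheme. The one-parameter model, the function names `grad_mse` and `data_parallel_step`, and the data are all hypothetical and chosen only for the example.

```python
# Toy simulation of data parallelism for a one-parameter model y = w * x
# trained with a mean squared error loss. Everything here is illustrative.

def grad_mse(w, batch):
    """Gradient of mean((w * x - y) ** 2) with respect to w over one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def data_parallel_step(w, batch, num_replicas, lr):
    """Run one training step: shard the batch, average the gradients, update w."""
    shard_size = len(batch) // num_replicas  # assumes the batch divides evenly
    shards = [batch[i * shard_size:(i + 1) * shard_size] for i in range(num_replicas)]
    # Every replica holds the same weight but sees a different slice of the data.
    local_grads = [grad_mse(w, shard) for shard in shards]
    # Synchronization point (assumed): average so all replicas apply the same update.
    avg_grad = sum(local_grads) / num_replicas
    return w - lr * avg_grad


batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w_parallel = data_parallel_step(0.0, batch, num_replicas=2, lr=0.01)
w_single = data_parallel_step(0.0, batch, num_replicas=1, lr=0.01)
```

With equally sized shards, the averaged gradient matches the full-batch gradient, so `w_parallel` and `w_single` agree up to floating-point rounding.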
decoder input IDs
This input is specific to encoder-decoder models, and contains the input IDs that will be fed to the decoder.
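One common way these IDs are produced during training is to shift the target labels one position to the right and prepend a decoder start token. The sketch below shows that idea in plain Python. The helper `shift_tokens_right`, the token ID values, and the use of `-100` as the ignored-label marker are assumptions made for illustration, not a specific library's API.

```python
def shift_tokens_right(labels, decoder_start_token_id, pad_token_id, ignore_index=-100):
    """Build decoder input IDs by shifting the labels one position to the right."""
    # Prepend the start token and drop the last label so the length is unchanged.
    shifted = [decoder_start_token_id] + list(labels[:-1])
    # Ignored label positions are not valid token IDs, so replace them with padding.
    return [pad_token_id if token == ignore_index else token for token in shifted]


# Hypothetical label sequence: four real tokens followed by two ignored positions.
labels = [42, 17, 8, 2, -100, -100]
decoder_input_ids = shift_tokens_right(labels, decoder_start_token_id=0, pad_token_id=1)
print(decoder_input_ids)  # [0, 42, 17, 8, 2, 1]
```

At each position the decoder then sees the tokens before the one it is asked to predict, which is why the inputs lag the labels by one step.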