T5 is a more distinctive model that casts all NLP tasks into a text-to-text problem using specific prefixes. For example, the prefix `Summarize:` indicates a summarization task. T5 is pretrained with a mix of supervised training (on the GLUE and SuperGLUE tasks) and self-supervised training (randomly sampling and dropping out 15% of tokens).
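As a minimal sketch of this text-to-text interface, the snippet below runs a summarization prompt through T5 with the 🤗 Transformers library (plus `sentencepiece` for the tokenizer); the `t5-small` checkpoint and the example input sentence are illustrative assumptions, not taken from the text above.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Assumed checkpoint for illustration; any T5 checkpoint works the same way.
tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# The task is selected purely by the text prefix, e.g. "summarize:",
# "translate English to German:", or "cola sentence:".
text = (
    "summarize: The tower is 324 metres tall, about the same height "
    "as an 81-storey building, and the tallest structure in Paris."
)
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because every task is phrased as text-in, text-out, the same model and generation call serve summarization, translation, and classification; only the prefix changes.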
## Audio

### Encoder[[audio-encoder]]
Wav2Vec2 uses a Transformer encoder to learn speech representations directly from raw audio waveforms. It is pretrained with a contrastive task to determine the true speech representation from a set of false ones.
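To illustrate how raw waveforms are consumed, here is a minimal sketch that feeds audio samples to a Wav2Vec2 model fine-tuned for CTC-based speech recognition; the `facebook/wav2vec2-base-960h` checkpoint and the silent placeholder waveform are assumptions for illustration, not part of the description above.

```python
import numpy as np
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

# Assumed checkpoint: Wav2Vec2 base model fine-tuned for English ASR.
processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# Placeholder input: one second of silence sampled at 16 kHz stands in
# for a real recording loaded with e.g. librosa or torchaudio.
waveform = np.zeros(16000, dtype=np.float32)
inputs = processor(waveform, sampling_rate=16000, return_tensors="pt")

# The encoder operates on the raw samples; no spectrogram is computed by hand.
with torch.no_grad():
    logits = model(inputs.input_values).logits

predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids))
```

Note that the processor only normalizes and pads the raw samples; the model itself learns its features from the waveform, which is the point made above.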