The pretraining objective is to predict each masked token from its surrounding context. This lets BERT use both the left and right context to learn a deeper, richer representation of the input. However, there was still room for improvement in BERT's pretraining strategy.
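The masking step behind this objective can be sketched in a few lines. This is a simplified illustration, not BERT's actual preprocessing code: it only randomly replaces a fraction of tokens with `[MASK]` (the BERT paper masks roughly 15% of tokens, and additionally swaps some for random or unchanged tokens, which is omitted here). The function name and token strings are hypothetical.

```python
import random

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=1):
    """Randomly hide a fraction of tokens; the model is trained to
    predict the hidden originals from the surrounding context."""
    rng = random.Random(seed)
    masked, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(mask_token)  # model sees [MASK] here
            targets.append(tok)        # label it must recover
        else:
            masked.append(tok)
            targets.append(None)       # no loss at unmasked positions
    return masked, targets

masked, targets = mask_tokens("the cat sat on the mat".split())
```

During pretraining, the loss is computed only at the masked positions, so the model must rely on both directions of context to fill in each gap.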