We pre-train BERT using the masked language modeling and next-sentence prediction tasks. However, pre-training BERT from scratch is computationally expensive, so instead of training it ourselves, we can download and use a pre-trained BERT model. Google has open-sourced pre-trained BERT models, which we can download from the Google Research GitHub repository. The models are released in several configurations, shown in the following table, where L denotes the number of encoder layers and H denotes the size of the hidden unit (representation size):

| Model | L (encoder layers) | H (hidden size) |
|---|---|---|
| BERT-tiny | 2 | 128 |
| BERT-mini | 4 | 256 |
| BERT-small | 4 | 512 |
| BERT-medium | 8 | 512 |
| BERT-base | 12 | 768 |
| BERT-large | 24 | 1024 |
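Although the checkpoints can be downloaded directly from the GitHub repository, a common and convenient way to load them in code is through the Hugging Face transformers library. The following is a minimal sketch, assuming transformers and PyTorch are installed, that loads the bert-base-uncased checkpoint (the BERT-base configuration), checks its L and H values, and encodes a sample sentence:

```python
from transformers import BertConfig, BertModel, BertTokenizer

# Inspect the configuration of BERT-base: L = 12 encoder layers, H = 768.
config = BertConfig.from_pretrained("bert-base-uncased")
print(config.num_hidden_layers, config.hidden_size)  # 12 768

# Download the pre-trained weights and the matching tokenizer.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

# Encode a sample sentence and obtain its hidden representations.
inputs = tokenizer(
    "BERT stands for Bidirectional Encoder Representations from Transformers.",
    return_tensors="pt",
)
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, 768)
```

Using the pre-trained weights this way gives us the same representations as the released checkpoints without paying the cost of pre-training from scratch.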
