Configurations of BERT
Learn about some BERT configurations.
Standard configurations of BERT
The authors of BERT presented the model in two standard configurations:
BERT-base
BERT-large
Let's take a look at each of these in detail.
BERT-base
BERT-base consists of 12 encoder layers stacked one on top of the other. Each encoder uses 12 attention heads. The feedforward network in the ...
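To make the two figures quoted above concrete, here is a minimal Python sketch that records them and multiplies them out. The class name `EncoderStackConfig` and its field names are hypothetical, chosen only for illustration; they do not come from any BERT library. The sketch covers only the layer and head counts and leaves out the feedforward network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncoderStackConfig:
    """Hypothetical container for the layer and head counts of an encoder stack."""

    num_encoder_layers: int   # encoders stacked one on top of the other
    num_attention_heads: int  # attention heads inside each encoder

    def total_attention_heads(self) -> int:
        # Every encoder layer has its own set of heads, so the counts multiply.
        return self.num_encoder_layers * self.num_attention_heads


# BERT-base as described in this lesson: 12 encoder layers, 12 heads per layer.
BERT_BASE = EncoderStackConfig(num_encoder_layers=12, num_attention_heads=12)

print(BERT_BASE.total_attention_heads())  # 144
```

Because each of the 12 encoders carries its own 12 heads, the model as a whole contains 12 × 12 = 144 attention heads.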