Configurations of BERT

Learn about the standard configurations of BERT.

Standard configurations of BERT

The researchers behind BERT presented the model in two standard configurations:

  • BERT-base

  • BERT-large

Let's take a look at each of these in detail.

BERT-base

BERT-base consists of 12 encoder layers, each stacked one on top of the other. All the encoders use 12 attention heads, and the hidden size of each encoder is 768 (the feedforward network inside each encoder expands this to 4 × 768 = 3,072 units before projecting back down). Thus, the size of the representation obtained from BERT-base will be 768.

We use the following notations:

  • The number of encoder layers is denoted by L.

  • The number of attention heads is denoted by A.

  • The hidden size is denoted by H.

Thus, in the BERT-base model, we have L = 12, A = 12, and H = 768. The total number of parameters in BERT-base is about 110 million.
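To make these numbers concrete, here is a minimal sketch, assuming the Hugging Face transformers library and PyTorch are installed (neither is part of the original text). The default BertConfig values correspond to BERT-base, so we can read off L, A, and H directly and count the parameters:

```python
# A minimal sketch, assuming the Hugging Face `transformers` library
# and PyTorch are installed. The default BertConfig corresponds to
# BERT-base, so L, A, and H can be read directly from it.
from transformers import BertConfig, BertModel

config = BertConfig()  # defaults match BERT-base

print("L =", config.num_hidden_layers)    # 12 encoder layers
print("A =", config.num_attention_heads)  # 12 attention heads
print("H =", config.hidden_size)          # 768 (hidden size)
print("FFN inner size =", config.intermediate_size)  # 4 * H = 3072

# Build a randomly initialized model (no pretrained weights are
# downloaded) and count its parameters.
model = BertModel(config)
total = sum(p.numel() for p in model.parameters())
print(f"Total parameters: {total / 1e6:.0f}M")  # roughly 110M
```

Running this prints L = 12, A = 12, H = 768, and a parameter count of roughly 110 million, matching the figure quoted above.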

The BERT-base model is shown in the following diagram:
