BioBERT

Learn about the BioBERT domain-specific BERT model and how to pre-train and fine-tune it for NER and question-answering tasks.

As the name suggests, BioBERT is a biomedical domain-specific BERT model pre-trained on a large biomedical corpus. Because it learns biomedical domain-specific representations during pre-training, BioBERT outperforms the vanilla BERT model on biomedical texts. BioBERT uses the same architecture as the vanilla BERT model. After pre-training, we can fine-tune BioBERT for many biomedical domain-specific downstream tasks, such as biomedical question answering, biomedical named entity recognition, and more.
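
To make the fine-tuning step concrete, here is a minimal sketch of loading a pre-trained BioBERT checkpoint for a token-classification (NER) task with the Hugging Face Transformers library. The checkpoint name `dmis-lab/biobert-base-cased-v1.1`, the three-label disease tag set, and the example sentence are illustrative assumptions, not details from the original text.

```python
# Minimal sketch: load a BioBERT checkpoint for NER fine-tuning.
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Assumed BioBERT checkpoint on the Hugging Face Hub.
model_name = "dmis-lab/biobert-base-cased-v1.1"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(
    model_name,
    num_labels=3,  # e.g., O, B-Disease, I-Disease for a disease-NER task (assumed label set)
)

# Tokenize a biomedical sentence and run a forward pass.
text = "The patient was diagnosed with chronic lymphocytic leukemia."
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)
predictions = outputs.logits.argmax(dim=-1)  # per-token label IDs
print(predictions)
```

From here, the model would be fine-tuned on a labeled NER dataset in the usual way; the sketch only shows how the pre-trained BioBERT weights plug into a standard token-classification head.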

Pre-training the BioBERT model

BioBERT is pre-trained using biomedical domain-specific texts. We use the biomedical datasets from the following two sources:

  • PubMed: This is a citation database. It includes more than 30 million citations for biomedical literature from life science journals, online books, and MEDLINE (the National Library of Medicine's index of biomedical journal literature).

  • PubMed Central (PMC): This is a free online repository that includes articles that have been published in biomedical and life sciences journals.

BioBERT is pre-trained using PubMed abstracts and PMC full-text articles. The PubMed corpus consists of about 4.5 billion words, and the PMC corpus consists of about 13.5 billion words. We know that the general BERT model is pre-trained on a general domain corpus made up of the English Wikipedia and Toronto BookCorpus datasets. So, instead of pre-training BioBERT from scratch, we first initialize its weights with the general BERT model and then pre-train it on the biomedical domain-specific corpora.
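
The following is a minimal sketch of this recipe using the Hugging Face Transformers library: initialize from the general-domain BERT weights and continue pre-training on biomedical text with the masked language modeling objective (the next-sentence-prediction objective of the original setup is omitted here for brevity). The file path `pubmed_abstracts.txt` and the training hyperparameters are illustrative assumptions, not the paper's exact settings.

```python
# Minimal sketch: continue pre-training general BERT on a biomedical corpus.
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

# 1. Start from the general BERT weights (BioBERT keeps BERT's vocabulary).
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-cased")

# 2. Load and tokenize the biomedical corpus (assumed file of PubMed/PMC text, one document per line).
corpus = load_dataset("text", data_files={"train": "pubmed_abstracts.txt"})
tokenized = corpus.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

# 3. Continue pre-training with the masked language modeling objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
args = TrainingArguments(
    output_dir="biobert-pretraining",
    per_device_train_batch_size=8,
    num_train_epochs=1,
    learning_rate=1e-4,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
```

The key point the sketch illustrates is the initialization: the model starts from general BERT weights rather than random weights, and the biomedical corpora only continue the pre-training.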
