RuBERT for Russian
Learn about the RuBERT model for the Russian language and how it is trained by transferring knowledge from M-BERT.
RuBERT is a pre-trained BERT model for the Russian language. It is trained differently from the other BERT variants, as we will see next.
Pre-training the RuBERT model
RuBERT is trained by transferring knowledge from M-BERT. We know that M-BERT is trained on Wikipedia text from 104 languages and therefore has a good understanding of each of them. So, instead of training the monolingual RuBERT model from scratch, we train it by transferring knowledge from M-BERT. Before training, we initialize all of RuBERT's parameters with the parameters of the M-BERT model, except for the word embeddings, since the two models use different vocabularies.
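The following sketch shows how this initialization step could look with the Hugging Face transformers library. The model name, the RuBERT vocabulary size, and the overall setup are assumptions for illustration, not the exact configuration used to train RuBERT.

```python
from transformers import BertConfig, BertModel

# Load M-BERT, the source of the transferred knowledge.
mbert = BertModel.from_pretrained("bert-base-multilingual-cased")

# Assumed RuBERT vocabulary size; the real value comes from the Russian
# subword vocabulary built with Subword NMT (described below).
RU_VOCAB_SIZE = 120_000

# RuBERT keeps M-BERT's architecture but uses the new Russian vocabulary.
config = BertConfig.from_pretrained("bert-base-multilingual-cased",
                                    vocab_size=RU_VOCAB_SIZE)
rubert = BertModel(config)

# Copy every M-BERT parameter into RuBERT except the word embeddings,
# whose shape differs because the vocabularies differ.
rubert_state = rubert.state_dict()
for name, tensor in mbert.state_dict().items():
    if "word_embeddings" not in name and name in rubert_state:
        rubert_state[name] = tensor.clone()
rubert.load_state_dict(rubert_state)
```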
RuBERT is trained on Russian Wikipedia text and news articles. The Subword Neural Machine Translation (Subword NMT) toolkit is used to segment the text into subword units; that is, we create a subword vocabulary using Subword NMT. Compared to the vocabulary of the M-BERT model, RuBERT's subword vocabulary contains longer Russian words and subwords.
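As a rough sketch, here is how a Russian subword vocabulary could be built with the subword-nmt package; the corpus file name and the number of merge operations are placeholders, not the values used for RuBERT.

```python
from subword_nmt.learn_bpe import learn_bpe
from subword_nmt.apply_bpe import BPE

# Learn BPE merge operations from a Russian corpus (one sentence per line).
# "ru_corpus.txt" and the 30,000 merges are illustrative placeholders.
with open("ru_corpus.txt", encoding="utf-8") as corpus, \
        open("ru_bpe.codes", "w", encoding="utf-8") as codes:
    learn_bpe(corpus, codes, num_symbols=30000)

# Apply the learned merges to segment new text into subword units.
with open("ru_bpe.codes", encoding="utf-8") as codes:
    bpe = BPE(codes)

print(bpe.process_line("Москва является столицей России"))
```

The subword units produced this way form RuBERT's new vocabulary.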
Common words from M-BERT and RuBERT
Some words occur in both the M-BERT vocabulary and the monolingual RuBERT vocabulary. For these common words, we can copy the embeddings directly from M-BERT.
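Continuing the earlier sketch, the snippet below copies the M-BERT embedding vector for every token that also appears in the RuBERT vocabulary. Here, ru_vocab is a hypothetical mapping from RuBERT subwords to their new indices; embeddings for tokens not found in M-BERT stay randomly initialized.

```python
from transformers import BertTokenizer

# M-BERT's tokenizer gives us its vocabulary (token -> index).
mbert_tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")

mbert_embeddings = mbert.get_input_embeddings().weight.data
rubert_embeddings = rubert.get_input_embeddings().weight.data

# ru_vocab is a hypothetical dict mapping RuBERT subwords to their indices
# in the new vocabulary built with Subword NMT above.
ru_vocab = {"привет": 5, "мир": 6, "##ский": 7}

# For tokens present in both vocabularies, reuse the M-BERT embedding.
for token, ru_index in ru_vocab.items():
    if token in mbert_tokenizer.vocab:
        mbert_index = mbert_tokenizer.vocab[token]
        rubert_embeddings[ru_index] = mbert_embeddings[mbert_index].clone()
```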