
The Cross-Lingual Language Model (XLM)

Explore the Cross-Lingual Language Model (XLM) and its pre-training strategies: Causal Language Modeling (CLM), Masked Language Modeling (MLM), and Translation Language Modeling (TLM). Understand how XLM leverages monolingual and parallel datasets to learn multilingual representations that outperform multilingual BERT, and how it can be fine-tuned for cross-lingual NLP tasks.

The M-BERT model is pre-trained just like the regular BERT model, without any explicit cross-lingual objective. In this lesson, let's learn how to pre-train BERT with a cross-lingual objective. We refer to BERT trained with a cross-lingual objective as a cross-lingual language model (XLM). The XLM model learns cross-lingual representations and performs better than M-BERT.

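Before looking at the pre-training objectives in detail, the following minimal sketch shows how a pre-trained XLM checkpoint can be loaded to obtain multilingual representations. It assumes the Hugging Face transformers and PyTorch libraries are installed and uses the xlm-mlm-enfr-1024 checkpoint (an English-French XLM pre-trained with masked language modeling); any other XLM checkpoint would work the same way.

```python
import torch
from transformers import XLMTokenizer, XLMModel

# Load a pre-trained English-French XLM checkpoint (assumed to be available
# on the Hugging Face Hub); the "1024" in the name is its hidden size.
tokenizer = XLMTokenizer.from_pretrained("xlm-mlm-enfr-1024")
model = XLMModel.from_pretrained("xlm-mlm-enfr-1024")

# Encode a sentence and obtain its contextual (cross-lingual) representations.
inputs = tokenizer("Paris is a beautiful city", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One 1024-dimensional vector per token: shape (1, sequence_length, 1024).
print(outputs.last_hidden_state.shape)
```
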
Training dataset

The XLM model is pre-trained using both monolingual and parallel datasets. A parallel dataset consists of text in a language pair; that is, it contains the same text in two different languages. For example, for each English sentence, we also have the corresponding sentence in another language, such as French. A parallel dataset of this kind is also called a cross-lingual dataset; the sketch below illustrates the difference between the two types of data.

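To make the distinction concrete, here is a small illustrative sketch of what monolingual and parallel data look like; the sentences and the Python structures are invented purely for illustration.

```python
# Monolingual data: raw sentences in a single language (here, English).
monolingual_en = [
    "I am a student.",
    "Where is the library?",
]

# Parallel (cross-lingual) data: each entry pairs the same sentence in two
# languages, here English ("en") and French ("fr").
parallel_en_fr = [
    {"en": "I am a student.",       "fr": "Je suis étudiant."},
    {"en": "Where is the library?", "fr": "Où est la bibliothèque ?"},
]
```
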
The monolingual dataset is obtained from Wikipedia, and the parallel dataset is obtained from several sources, including MultiUN (a multilingual corpus of United Nations documents) ...