Summary: Applying BERT to Other Languages

Let's summarize what we have learned so far.

Key highlights

The key highlights of this chapter are summarized below.

  • We started off by understanding how the M-BERT model works. We learned that M-BERT is trained without any cross-lingual objective, just as the BERT model is, yet it produces representations that generalize across multiple languages for downstream tasks (a short sketch after this list illustrates this).

  • We then investigated how multilingual M-BERT really is. We learned that M-BERT's cross-lingual generalization does not depend on vocabulary overlap; instead, it relies on typological similarity between languages. We also saw that M-BERT handles code-switched text well but not transliterated text.
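
To make the first point concrete, here is a minimal sketch (not taken from the chapter) that loads the public bert-base-multilingual-cased checkpoint with the Hugging Face transformers library and compares mean-pooled sentence representations across languages. The example sentences and the mean-pooling step are illustrative assumptions, not a rigorous cross-lingual evaluation.

```python
# A minimal sketch: compare M-BERT sentence representations across languages.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

def embed(sentence):
    # Mean-pool the final hidden states into a single sentence vector.
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state.mean(dim=1).squeeze(0)

english = embed("The weather is nice today.")
german = embed("Das Wetter ist heute schön.")            # German translation
unrelated = embed("Stock prices fell sharply this morning.")

cos = torch.nn.functional.cosine_similarity
print("translation pair :", cos(english, german, dim=0).item())
print("unrelated pair   :", cos(english, unrelated, dim=0).item())
```

Because M-BERT embeds all languages in a shared space, the translation pair will typically score a noticeably higher cosine similarity than the unrelated pair, even though no cross-lingual objective was used during pre-training.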
