BETO for Spanish

Learn about BETO and how to use it to predict masked words.

BETO is a pre-trained BERT model for the Spanish language from the Universidad de Chile. It is trained on the masked language modeling (MLM) task with whole word masking (WWM), and its configuration is the same as that of the standard BERT-base model.

Variants of BETO

The researchers of BETO released two variants of the model (a loading sketch follows the list):

  • BETO-cased, trained on cased text

  • BETO-uncased, trained on uncased (lowercased) text
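
Both variants can be loaded directly with the Hugging Face transformers library. The following is a minimal sketch; the checkpoint names used here, dccuchile/bert-base-spanish-wwm-cased and dccuchile/bert-base-spanish-wwm-uncased, are the identifiers the Universidad de Chile group publishes on the Hugging Face Hub.

from transformers import BertForMaskedLM, BertTokenizer

# Hub identifiers for the two BETO variants
CASED = "dccuchile/bert-base-spanish-wwm-cased"
UNCASED = "dccuchile/bert-base-spanish-wwm-uncased"

# Load the cased variant; swap in UNCASED for the uncased one
tokenizer = BertTokenizer.from_pretrained(CASED)
model = BertForMaskedLM.from_pretrained(CASED)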

Performance of BETO

The pre-trained BETO model is open source, so we can download it directly and use it for downstream tasks. The researchers have also shown that BETO outperforms multilingual BERT (M-BERT) on many Spanish downstream tasks.
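
To see BETO predicting masked words, here is a minimal sketch using the transformers fill-mask pipeline; the Spanish example sentence is our own illustration, and we assume the cased checkpoint named above.

from transformers import pipeline

# Build a fill-mask pipeline around the cased BETO checkpoint
predictor = pipeline("fill-mask", model="dccuchile/bert-base-spanish-wwm-cased")

# Ask BETO to fill in the masked word in a Spanish sentence:
# "Todos los caminos llevan a [MASK]." ("All roads lead to [MASK].")
for prediction in predictor("Todos los caminos llevan a [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 4))

Each prediction carries a candidate token and its probability, so the first entry is the model's most likely completion; for this sentence, we would expect something like "Roma" near the top.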
