Masked Autoencoders: Masking and Encoder

Learn how to implement the masking strategy and encoder layer of Masked Autoencoders (MAE).


Similar to SimMIM, a Masked Autoencoder (MAE) reconstructs randomly masked image patches in pixel space. It uses an asymmetric encoder-decoder design: the encoder sees only the visible patches (i.e., masked patches are not part of its input), while a lightweight decoder reconstructs the full input from the encoded visible patches together with mask tokens. The figure below illustrates the idea.
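To make the masking step concrete, here is a minimal NumPy sketch of per-image random masking, written under a few assumptions: the image has already been split into flattened patches of shape `(batch, num_patches, dim)`, and the function name `random_masking`, its return values, and the default `mask_ratio` of 0.75 are illustrative choices rather than anything fixed by this lesson. It shuffles patch indices by sorting random noise, keeps the first part of the shuffle as the encoder input, and records what a decoder would need to put patches back in order.

```python
import numpy as np


def random_masking(patches, mask_ratio=0.75, rng=None):
    """Randomly drop a fraction of patches from each image (illustrative sketch).

    patches: array of shape (batch, num_patches, dim).
    Returns:
      visible     - kept patches, shape (batch, num_keep, dim)
      mask        - shape (batch, num_patches); 1 = masked, 0 = visible,
                    in the original patch order
      ids_restore - indices that undo the shuffle, shape (batch, num_patches)
    """
    rng = np.random.default_rng() if rng is None else rng
    batch, num_patches, _ = patches.shape
    num_keep = int(num_patches * (1 - mask_ratio))

    # Sorting random noise yields an independent random permutation per image.
    noise = rng.random((batch, num_patches))
    ids_shuffle = np.argsort(noise, axis=1)
    # Sorting the permutation gives its inverse (shuffled order -> original order).
    ids_restore = np.argsort(ids_shuffle, axis=1)

    # The first num_keep shuffled patches are the only ones the encoder sees.
    ids_keep = ids_shuffle[:, :num_keep]
    visible = np.take_along_axis(patches, ids_keep[:, :, None], axis=1)

    # Build the mask in shuffled order, then map it back to the original order.
    mask = np.ones((batch, num_patches))
    mask[:, :num_keep] = 0
    mask = np.take_along_axis(mask, ids_restore, axis=1)
    return visible, mask, ids_restore


# Usage: 2 images, 16 patches each, 8 values per patch.
patches = np.arange(2 * 16 * 8, dtype=float).reshape(2, 16, 8)
visible, mask, ids_restore = random_masking(patches, rng=np.random.default_rng(0))
print(visible.shape, mask.shape)
```

Only `visible` would be passed to the encoder, which is what keeps it cheap at high mask ratios; `mask` and `ids_restore` are what a decoder would use to insert mask tokens at the dropped positions and to compute the reconstruction loss on masked patches.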
