Masked Siamese Networks: Objective and Training

Explore the training objectives of Masked Siamese Networks by understanding how similarity between anchor and target views is measured using learnable prototypes. Learn to implement the cross-entropy loss and mean entropy maximization regularizer to train encoders that produce confident, diverse predictions for self-supervised image modeling.

Similarity metric and predictions

To train the encoder, MSNs compute a distribution based on the similarity between a set of learnable prototypes $Q = \{q_1, q_2, \ldots, q_K\}$ (think of each prototype, $q_i$, as a $d$-dimensional vector, hence $Q$ is a $K \times d$ matrix) and each anchor, $z_i^m$ ...
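
As a rough illustration of the idea above, the following NumPy sketch turns anchor-to-prototype similarities into a distribution over the $K$ prototypes. It assumes cosine similarity followed by a temperature-scaled softmax. The function name `prototype_predictions`, the `temperature` value, and the random inputs are illustrative assumptions rather than details taken from the text.

```python
import numpy as np

def prototype_predictions(anchors, prototypes, temperature=0.1):
    """Return a (B, K) matrix of soft assignments of anchors to prototypes.

    anchors:     (B, d) array of anchor representations
    prototypes:  (K, d) array, the prototype matrix Q
    temperature: assumed sharpening parameter (value is illustrative)
    """
    # L2-normalize rows so that the dot product equals cosine similarity.
    z = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    q = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)

    # Similarity of every anchor to every prototype, scaled by temperature.
    logits = (z @ q.T) / temperature                      # shape (B, K)

    # Softmax over the prototype axis (shifted for numerical stability).
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

# Toy inputs: B anchors, K prototypes, d dimensions (all values random).
rng = np.random.default_rng(0)
B, K, d = 4, 8, 16
anchors = rng.normal(size=(B, d))
prototypes = rng.normal(size=(K, d))

p = prototype_predictions(anchors, prototypes)
print(p.shape)  # (4, 8): one distribution over the K prototypes per anchor
```

Each row of `p` sums to 1. A lower `temperature` concentrates the mass on the most similar prototype, while a higher one spreads it more evenly.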