SimCLR Training Objective
Explore the SimCLR training objective: two augmented views of each image pass through a neural network backbone and an MLP projection head to produce feature embeddings. Learn how the contrastive loss computes a similarity matrix over these embeddings, maximizing similarity between positive pairs and minimizing it between negative pairs. This lesson guides you step by step through implementing the SimCLR contrastive loss, preparing you to build self-supervised learning models for unlabeled data.
Now that we have two augmented versions of the input batch,
Network architecture
As shown in the figure below, the two augmented versions of an image,
The code example below implements the class SimCLR_Network, which passes the input image through a resnet18 backbone and an MLP projection head.
Line 10: We implement the class SimCLR_Network, which passes the input image through a resnet18 backbone and an MLP projection head.
Line 13: We define the feature backbone self.backbone as a resnet18 network.
Lines 14–15: We remove the fully connected classification layer of resnet18 by reinitializing it as an nn.Identity() layer. self.backbone then takes an image and returns a feature vector.
Lines 18–24: We define the projection head self.projection as an MLP using nn.Linear, nn.ReLU, and nn.BatchNorm1d layers. This projection layer takes resnet18's features from self.backbone and ...