Coding Exercise on Masked Modeling
Explore how to implement a focal masking strategy for masked image modeling in self-supervised learning. Learn to generate focal masks, apply them in SimMIM training, and optimize image reconstruction using continuous local image blocks to mask and reconstruct images.
We'll cover the following...
Problem statement
So far while implementing masked image modeling algorithms (SimMIM, MAEs, MSNs), we've used random patch-dropping as our masking strategy. In this exercise, we'll implement another masking strategy called focal masking. Focal masking selects a continuous local block of an image and masks everything around it. The figure below illustrates the difference between random and focal masking.
The following code snippet implements the FocalMaskGenerator class, which takes input_size, mask_patch_size, and block_size as parameters in its __init__ call and defines a variable, self.rand_size, which denotes the number of patches along one dimension