CNN Building Blocks
Get to know the most commonly used CNN building blocks.
Convolution layer
A neural network wouldn’t be named convolutional if it didn’t have at least one convolution layer. A two-dimensional convolution layer slides a small tensor (the convolution kernel) over the input tensor. The kernel spans only a few pixels in height and width, but it has the same number of channels as the input tensor. At each pixel position of the input tensor, the layer computes the scalar product between the convolution kernel and the neighborhood centered on that pixel. The result is a tensor with a high activation wherever the input tensor locally looks like the convolution kernel.
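The scalar product described above can be checked on a toy example. The following is a minimal sketch, assuming a hypothetical random 3-channel 6x6 input and a 3-channel 3x3 kernel (neither comes from the lesson). It computes the response at one pixel by hand and compares it with the value returned by torch.nn.functional.conv2d.

```python
import torch
import torch.nn.functional as F

# Hypothetical toy data, chosen only for illustration
torch.manual_seed(0)
image = torch.rand(3, 6, 6)   # (C, H, W)
kernel = torch.rand(3, 3, 3)  # (C, kH, kW), same channel count as the input

# Library result: conv2d expects a batch dimension on the input and an
# output-channel dimension on the kernel. padding=1 keeps the 6x6 size.
response = F.conv2d(image.unsqueeze(0), kernel.unsqueeze(0), padding=1)[0, 0]  # (H, W)

# Manual result at one interior pixel: scalar product between the kernel and
# the 3x3 neighborhood (across all channels) centered on (row, col)
row, col = 2, 3
neighborhood = image[:, row - 1: row + 2, col - 1: col + 2]
manual = torch.sum(neighborhood * kernel)

# The two values should agree up to floating-point rounding
print(manual.item(), response[row, col].item())
```

The two values match without flipping the kernel because PyTorch's convolution layers compute a cross-correlation, which is why the output is largest where the input resembles the kernel as it is stored.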
To illustrate the functioning of a convolution layer, let’s design one by hand.
Consider the following image:
Let’s assume we need to highlight the areas where a green surface touches a yellow surface through a vertical boundary. To do that, we’ll convolve the image with the following kernel:
We can manually design the convolution kernel by accessing the weight and bias fields of a torch.nn.Conv2d object:
import torch
import cv2

weight = torch.zeros(1, 3, 5, 5)  # (N_kernels, N_channels, H, W)
bias = torch.zeros(1)
weight[0, 0, :, 0:2] = 90  # The green band on the left
weight[0, 1, :, 0:2] = 120
weight[0, 2, :, 0:2] = 35
weight[0, 0, :, 2:] = 75  # The yellow band on the right
weight[0, 1, :, 2:] = 160
weight[0, 2, :, 2:] = 160
cv2.imwrite("./output/0_kernel_weight.png", cv2.resize(torch.moveaxis(weight.squeeze(0), 0, 2).int().numpy(), dsize=(300, 300), interpolation=cv2.INTER_NEAREST))

# Standardize the weight
weight = (weight - torch.mean(weight)) / torch.std(weight)
bias[0] = 0.
In lines 6–11, we manually set the BGR values of the weight tensor. In line 16, the bias value is set to 0.
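The lesson does not say why the weight is standardized, so the following is only a sketch of one observable effect. It rebuilds the same kernel and checks that, after standardization, the weights have zero mean and unit standard deviation. One consequence is that a uniform gray patch produces a response close to zero. The names gray_patch, gray_response, and raw_gray_response are hypothetical and are introduced here for illustration.

```python
import torch

# Rebuild the hand-designed kernel from the listing above (BGR order)
green = torch.tensor([90.0, 120.0, 35.0])
yellow = torch.tensor([75.0, 160.0, 160.0])
weight = torch.zeros(1, 3, 5, 5)  # (N_kernels, N_channels, H, W)
weight[0, :, :, 0:2] = green.view(3, 1, 1)   # The green band on the left
weight[0, :, :, 2:] = yellow.view(3, 1, 1)   # The yellow band on the right

# Same standardization as in the listing above
standardized = (weight - torch.mean(weight)) / torch.std(weight)
mean_after = standardized.mean().item()
std_after = standardized.std().item()

# Hypothetical probe: a uniform gray patch (same value in every channel)
gray_patch = torch.full((3, 5, 5), 0.5)
raw_gray_response = torch.sum(weight[0] * gray_patch).item()
gray_response = torch.sum(standardized[0] * gray_patch).item()

print(mean_after, std_after)             # close to 0 and 1
print(raw_gray_response, gray_response)  # large before, close to 0 after
```

With the raw weights, any bright region yields a large response regardless of its content. With the zero-mean weights, a featureless gray area contributes almost nothing, so the output depends more on color and structure than on overall brightness.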
We can now perform a convolution of the image with a kernel that will use this weight and bias:
# ... continued
# Create a 2D convolution layer with the designed weight and bias
conv = torch.nn.Conv2d(3, 1, kernel_size=(5, 5), padding='same')
# The tensors must be wrapped in torch.nn.Parameter objects
conv.weight = torch.nn.Parameter(weight)
conv.bias = torch.nn.Parameter(bias)
# Load an image
original_img = cv2.imread("./images/electronics/rpi_back.jpg")
cv2.imwrite("./output/1_original.png", original_img)
# Convert the image to a tensor
input_tsr = torch.from_numpy(original_img).float() / 255.0  # (H, W, C)
input_tsr = torch.moveaxis(input_tsr, 2, 0)  # (C, H, W)
# Compute the convolution on the input tensor
convolution_tsr = conv(input_tsr)  # (1, H, W)
# Save the convolution image
convolution_img = convolution_tsr.squeeze(0).detach().numpy()  # (H, W)
cv2.imwrite("./output/2_convolution.png", 127 + 10 * convolution_img)
In lines 5 and 6, we set the weight
...
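The designed kernel can also be tried without the image file. The following is a minimal sketch, assuming a hypothetical synthetic image whose left half is the kernel's green and whose right half is the kernel's yellow. The image size and the boundary_col value are illustrative choices, and torch.nn.functional.conv2d stands in for the torch.nn.Conv2d object used in the lesson.

```python
import torch
import torch.nn.functional as F

green = torch.tensor([90.0, 120.0, 35.0])    # BGR
yellow = torch.tensor([75.0, 160.0, 160.0])  # BGR

# Same kernel design and standardization as in the lesson
weight = torch.zeros(1, 3, 5, 5)  # (N_kernels, N_channels, H, W)
weight[0, :, :, 0:2] = green.view(3, 1, 1)
weight[0, :, :, 2:] = yellow.view(3, 1, 1)
weight = (weight - weight.mean()) / weight.std()
bias = torch.zeros(1)

# Hypothetical test image: green on the left, yellow on the right
height, width, boundary_col = 20, 30, 15
image = torch.zeros(3, height, width)  # (C, H, W), values in [0, 1]
image[:, :, :boundary_col] = green.view(3, 1, 1) / 255.0
image[:, :, boundary_col:] = yellow.view(3, 1, 1) / 255.0

# Functional form of the convolution, with an explicit batch dimension
response = F.conv2d(image.unsqueeze(0), weight, bias, padding="same")[0, 0]  # (H, W)

# Look at the middle row and find the column with the strongest activation
profile = response[height // 2]
peak_col = int(torch.argmax(profile))
print(peak_col, boundary_col)
```

The peak lands on the first yellow column. At that position, the 5x5 window covers two green columns followed by three yellow columns, which is exactly the layout of the kernel.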