CNN Building Blocks

Explore the building blocks of convolutional neural networks, including convolution layers, dropout 2D, batch normalization, and max pooling. Understand how these components work together to extract features, reduce dimensionality, and improve training effectiveness for image classification tasks.

We'll cover the following...

Convolution layer
Dropout 2D
Batch normalization 2D
MaxPool 2D
Flattening

Convolution layer

A neural network wouldn’t be called convolutional if it didn’t have at least one convolution layer. A two-dimensional convolution layer slides a small kernel across the input tensor. The kernel spans only a few pixels in height and width, but it has the same number of channels as the input tensor. At each pixel location, the layer computes the scalar product between the kernel and the neighborhood centered on that pixel, then adds a bias. The result is a tensor whose activation is high wherever the input tensor locally resembles the convolution kernel.
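The scalar-product description can be checked numerically. The following is a minimal sketch, assuming random input data and a hypothetical helper named `response_at`; it compares one output value of `torch.nn.functional.conv2d` with a scalar product computed by hand over the corresponding neighborhood.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
image = torch.rand(3, 8, 8)      # (C, H, W), random stand-in for an image
kernel = torch.rand(1, 3, 5, 5)  # (out_channels, in_channels, kH, kW)

# Library result, zero-padded so the output keeps the input height and width
out = F.conv2d(image.unsqueeze(0), kernel, padding=2)  # (1, 1, 8, 8)

def response_at(image, kernel, row, col):
    """Scalar product between the kernel and the neighborhood centered on (row, col).

    Hypothetical helper for illustration; assumes a square, odd-sized kernel.
    """
    k = kernel.shape[-1] // 2
    padded = F.pad(image, (k, k, k, k))  # zero-pad width and height
    patch = padded[:, row:row + 2 * k + 1, col:col + 2 * k + 1]  # (C, kH, kW)
    return (patch * kernel[0]).sum()

manual = response_at(image, kernel, 3, 4)
print(manual.item(), out[0, 0, 3, 4].item())
```

The two printed values should agree up to floating-point rounding. Note that the kernel is applied without flipping, which is how PyTorch defines its convolution layers.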

To illustrate the functioning of a convolution layer, let’s design one by hand.

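Consider a 5×5 kernel with a green band on the left and a yellow band on the right. The following code builds it and saves it as an image: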

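Python

import torch
import cv2

# Kernel weights have shape (out_channels, in_channels, kernel_height, kernel_width).
# The channel order is B, G, R, as in OpenCV images.
weight = torch.zeros(1, 3, 5, 5)
bias = torch.zeros(1)
weight[0, 0, :, 0:2] = 90   # The green band on the left (two columns)
weight[0, 1, :, 0:2] = 120
weight[0, 2, :, 0:2] = 35
weight[0, 0, :, 2:] = 75  # The yellow band on the right (three columns)
weight[0, 1, :, 2:] = 160
weight[0, 2, :, 2:] = 160
# Save an enlarged view of the kernel as an 8-bit (H, W, C) image.
# The ./output directory must already exist.
kernel_img = torch.moveaxis(weight.squeeze(0), 0, 2).contiguous().to(torch.uint8).numpy()
cv2.imwrite("./output/0_kernel_weight.png", cv2.resize(kernel_img, dsize=(300, 300), interpolation=cv2.INTER_NEAREST))
# Standardize the weight to zero mean and unit standard deviation
weight = (weight - torch.mean(weight)) / torch.std(weight)
bias[0] = 0.  # The bias stays at zero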

Python

# ... continued
# Create a 2D convolution layer; kernel_size matches the 5x5 weight designed above
conv = torch.nn.Conv2d(3, 1, kernel_size=(5, 5), padding='same')
# The tensors must be wrapped in torch.nn.Parameter objects
conv.weight = torch.nn.Parameter(weight)
conv.bias = torch.nn.Parameter(bias)
# Load an image (OpenCV returns an (H, W, C) array in B, G, R order)
original_img = cv2.imread("./images/electronics/rpi_back.jpg")
cv2.imwrite("./output/1_original.png", original_img)
# Convert the image to a tensor
input_tsr = torch.from_numpy(original_img).float() / 255.0  # (H, W, C)
input_tsr = torch.moveaxis(input_tsr, 2, 0)  # (C, H, W)
# Compute the convolution on the unbatched input tensor
convolution_tsr = conv(input_tsr)  # (1, H, W)
# Save the convolution image
convolution_img = convolution_tsr.squeeze(0).detach()  # (H, W)
# Map the response to a displayable 8-bit range: 127 means zero, with a gain of 10
convolution_img = (127 + 10 * convolution_img).clamp(0, 255).to(torch.uint8).numpy()
cv2.imwrite("./output/2_convolution.png", convolution_img)
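Since the photograph used above may not be available, the claim that the activation peaks where the input looks like the kernel can also be tried on synthetic data. The sketch below is an illustration under stated assumptions, not part of the lesson's code: it rebuilds the same hand-designed kernel, pastes one copy of the pattern onto a black image at a hypothetical location (`top`, `left`), and looks for the strongest response.

```python
import torch
import torch.nn.functional as F

# Rebuild the hand-designed 5x5 pattern (channel order B, G, R)
pattern = torch.zeros(3, 5, 5)  # (C, kH, kW)
pattern[:, :, 0:2] = torch.tensor([90., 120., 35.]).view(3, 1, 1)   # green band, left
pattern[:, :, 2:] = torch.tensor([75., 160., 160.]).view(3, 1, 1)   # yellow band, right

# Standardized kernel, shaped (out_channels, in_channels, kH, kW)
weight = ((pattern - pattern.mean()) / pattern.std()).unsqueeze(0)

# Synthetic image: black background with a single copy of the pattern
image = torch.zeros(3, 20, 30)  # (C, H, W)
top, left = 8, 12               # hypothetical paste location
image[:, top:top + 5, left:left + 5] = pattern / 255.0

# Same-size convolution (zero padding of 2 for a 5x5 kernel), no bias
response = F.conv2d(image.unsqueeze(0), weight, padding=2)[0, 0]  # (H, W)

# Locate the strongest activation
flat_idx = torch.argmax(response).item()
peak_row, peak_col = divmod(flat_idx, response.shape[1])
print(peak_row, peak_col)
```

With these values, the peak should land on the center of the pasted patch, at row `top + 2` and column `left + 2`. Because the standardized kernel sums to zero, a region of uniform gray produces no response away from the image border, which is one reason for standardizing the weights.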
