CNN Building Blocks
Explore the building blocks of convolutional neural networks, including convolution layers, dropout 2D, batch normalization, and max pooling. Understand how these components work together to extract features, reduce dimensionality, and improve training effectiveness for image classification tasks.
Convolution layer
A neural network wouldn’t be called convolutional if it didn’t have at least one convolution layer. A two-dimensional convolution layer scans the input tensor with a small two-dimensional tensor (the convolution kernel) that has the same number of channels as the input. For each pixel in the input tensor, the layer computes a scalar product between the kernel and the neighborhood centered on that pixel. The result is a tensor with high activations wherever the input looks like the convolution kernel.
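To see this scalar product concretely, here is a minimal sketch (not the lesson's code) that verifies one output pixel of `torch.nn.functional.conv2d` by computing the scalar product between the kernel and the corresponding input neighborhood by hand:

```python
import torch
import torch.nn.functional as F

# A single 5x5 single-channel input with values 0..24.
x = torch.arange(25, dtype=torch.float32).reshape(1, 1, 5, 5)
kernel = torch.ones(1, 1, 3, 3)

y = F.conv2d(x, kernel)  # no padding: output is 3x3

# The top-left output pixel is the scalar product of the kernel with
# the top-left 3x3 neighborhood of the input.
neighborhood = x[0, 0, 0:3, 0:3]
assert torch.isclose(y[0, 0, 0, 0], (neighborhood * kernel[0, 0]).sum())
print(y[0, 0, 0, 0].item())  # 54.0
```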
To illustrate the functioning of a convolution layer, let’s design one by hand.
Consider the following image:
Let’s assume we need to highlight the areas where a green surface touches a yellow surface through a vertical boundary. To do that, we’ll convolve the image with the following kernel:
We can manually design the convolution kernel by accessing the weight and bias fields of a torch.nn.Conv2d object:
In lines 6–11, we manually set the BGR values of the weight tensor, and in line 16, we set the bias to 0.
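As a standalone sketch of what such hand-designed weights can look like (the specific kernel values below are illustrative assumptions, not the lesson's exact ones): green (BGR 0, 1, 0) and yellow (BGR 0, 1, 1) differ only in the red channel, so a vertical-edge kernel on the red channel highlights green/yellow boundaries.

```python
import torch

# One output channel, three input channels (BGR), 3x3 kernel.
conv = torch.nn.Conv2d(in_channels=3, out_channels=1, kernel_size=3, padding=1)

with torch.no_grad():
    # Zero out the blue and green channels of the kernel; only the red
    # channel (index 2 in BGR order) gets a vertical-edge detector.
    conv.weight.zero_()
    conv.weight[0, 2] = torch.tensor([[-1.0, 0.0, 1.0],
                                      [-1.0, 0.0, 1.0],
                                      [-1.0, 0.0, 1.0]])
    conv.bias.zero_()
```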
We can now perform a convolution of the image with a kernel that will use this weight and bias:
In lines 5 and 6, we set the weight ...
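Putting the pieces together, here is a self-contained sketch of the whole procedure; the synthetic image, kernel values, and sizes are assumptions for illustration rather than the lesson's exact code:

```python
import torch

# Hand-designed layer: red-channel vertical-edge detector (see above).
conv = torch.nn.Conv2d(3, 1, kernel_size=3, padding=1)
with torch.no_grad():
    conv.weight.zero_()
    conv.weight[0, 2] = torch.tensor([[-1.0, 0.0, 1.0]] * 3)
    conv.bias.zero_()

# Synthetic BGR image: green on the left half, yellow on the right half.
image = torch.zeros(1, 3, 8, 8)   # batch of 1, channels in BGR order
image[:, 1, :, :] = 1.0           # green channel on everywhere
image[:, 2, :, 4:] = 1.0          # red channel on the right half -> yellow

with torch.no_grad():
    activation = conv(image)

# The activation peaks along the vertical green/yellow boundary.
print(activation[0, 0, 4])        # strongest response near column 4
```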