The Principles of the Convolution
Learn the fundamental principles behind convolution in convolutional neural networks. Understand how convolutional layers preserve spatial structures by sliding kernels over images to extract features. This lesson helps you grasp key concepts such as local receptive fields, feature maps, and how convolutions capture spatial correlations, preparing you to implement convolution operations from scratch.
We'll cover the following...
Why convolution?
The fully connected layer that we saw doesn’t respect the spatial structure of the input. If, for example, the input is an image, the NN will destruct the 2D structure into a 1-dimensional vector. To address the issue, we have designed Convolutional Neural Networks (CNNs). They work exceptionally well for computer vision applications.
Why do we use them when we process images? Because we know a priori that nearby pixels share similar characteristics and we want to take that into account by design. That assumption is called the inductive bias.
Convolutional layers exploit the local structure of the data.
But how is it possible to focus on the local structure instead of fully connected layers that take linear combinations of the input?
The answer is quite simple. We restrict the convolutional layer to operate on a local window called kernel. Then, we slide this window throughout the input image.
Convolution
The basic operation of CNNs is the convolution. Mathematically, a convolution between two 2-dimensional functions is defined as:
...