Convolution

Explore the convolution operation and its role in Convolutional Neural Networks for image analysis. Learn how feature detectors transform input images into feature maps, reducing image size and highlighting important details. This lesson helps you grasp the fundamentals needed to build a COVID-19 detection system using X-ray images.

We'll cover the following...

- What is a convolution operation?
- The need for the convolution operation
- How do CNNs actually perform convolution?

What is a convolution operation?

In mathematical terms, convolution is an operation on two functions that produces a third function, which expresses how the shape of one is modified by the other. Sounds complicated?

That’s okay. We will walk through the process step by step so you can see how it works. But before that, we need to take a look at the formula:

(f * g)(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\,\mathrm{d}\tau

Those of you who have worked in any field that involves signal processing are probably familiar with the convolution function.
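If you find code easier to read than integrals, here is a minimal sketch of the discrete counterpart of the formula, where the integral becomes a sum over array indices. The function name `discrete_convolution` and the two sample arrays are made up for illustration; NumPy's `np.convolve` computes the same thing.

```python
import numpy as np

def discrete_convolution(f, g):
    """Discrete version of (f * g)(t): sum over tau of f[tau] * g[t - tau]."""
    n = len(f) + len(g) - 1
    out = np.zeros(n)
    for t in range(n):
        for tau in range(len(f)):
            # Only add terms where the index t - tau falls inside g
            if 0 <= t - tau < len(g):
                out[t] += f[tau] * g[t - tau]
    return out

# Hypothetical sample signals, chosen only for illustration
f = np.array([1.0, 2.0, 3.0])
g = np.array([0.0, 1.0, 0.5])

result = discrete_convolution(f, g)
print(result)             # [0.  1.  2.5 4.  1.5]
print(np.convolve(f, g))  # same values from NumPy's built-in
```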

But don’t worry, we will now look at an example of how this operation actually works.

Let’s say we have an image of a smiley face (just a simple image for better understanding; the same concept applies to complex images too). We represent it as a matrix, assigning a value of 0 to every cell with no color and a value of 1 to every black cell. See the image below to understand this concept.

Follow these steps to create a feature map:

Place the feature detector on the top-left corner of your input image, count the cells where both the feature detector and the image contain a 1, and write this count into the top-left cell of the feature map matrix.
For this example, we got a value of 0 in the first cell because none of the cells of the filter match the cells of the input image underneath it.
Repeat this step after shifting the feature detector to the right by one pixel. The size of this shift is called the stride, and since we are shifting by one pixel, we have a stride of one pixel. You can use a stride of more than one pixel, but it can leave out some important features in your image.
After you have gone through the whole first row, move down to the next row and go through the same process.

Now go through the complete input image and check whether your output matches the correct one.
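The steps above can be sketched in a few lines of NumPy. This is only an illustration under some assumptions: the 7 x 7 image below is a made-up binary pattern rather than the exact smiley face from the lesson, the 3 x 3 feature detector is hypothetical, and the function name `feature_map` is our own. Counting the cells where both matrices hold a 1 is the same as multiplying element-wise and summing. Note that this sliding-window sum does not flip the detector, which is how deep learning libraries typically implement "convolution".

```python
import numpy as np

def feature_map(image, detector, stride=1):
    """Slide the detector over the image and count matching 1s at each position."""
    ih, iw = image.shape
    kh, kw = detector.shape
    oh = (ih - kh) // stride + 1   # number of vertical positions
    ow = (iw - kw) // stride + 1   # number of horizontal positions
    out = np.zeros((oh, ow), dtype=int)
    for r in range(oh):
        for c in range(ow):
            top, left = r * stride, c * stride
            patch = image[top:top + kh, left:left + kw]
            # Element-wise product is 1 only where both cells are 1
            out[r, c] = int(np.sum(patch * detector))
    return out

# Hypothetical 7 x 7 binary image (0 = no color, 1 = black)
image = np.array([
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
])

# Hypothetical 3 x 3 feature detector
detector = np.array([
    [0, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
])

fmap = feature_map(image, detector, stride=1)
print(fmap.shape)  # (5, 5)
print(fmap)
```

With this particular image and detector, the top-left cell of the feature map is 0, just as in the walkthrough, because no 1 in the detector lines up with a 1 in the top-left patch of the image.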

By the way, a feature detector is also called a kernel or a filter, and a feature map is also called an activation map. Within each group, the terms are interchangeable.

The need for the convolution operation

The main reason is to reduce the size of the input image while keeping the features that matter. The feature map is smaller than the input, and the larger your strides are (the movements across pixels), the smaller your feature map is. This is because the stride is the distance the filter moves over the image at each step: with a large stride, the filter lands on fewer positions, skips portions of the image, and generates a smaller feature map. When dealing with real images, you may find it useful to widen your strides. Here, we were dealing with a 7 x 7 input image, but real images tend to be substantially larger and more complex.
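To make the effect of the stride concrete, here is a small sketch that computes the side length of the feature map. It assumes a square input, a square filter, and no padding around the image. The 3 x 3 filter size and the function name `feature_map_size` are assumptions made for illustration.

```python
def feature_map_size(input_size, kernel_size, stride=1):
    """Side length of the feature map for a square input with no padding."""
    return (input_size - kernel_size) // stride + 1

# 7 x 7 input with an assumed 3 x 3 filter, for strides of 1, 2, and 3
sizes = {stride: feature_map_size(7, 3, stride) for stride in (1, 2, 3)}
print(sizes)  # {1: 5, 2: 3, 3: 2}
```

A stride of one turns the 7 x 7 input into a 5 x 5 feature map, while a stride of three shrinks it to 2 x 2.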

How do CNNs actually perform convolution?

The example we gave above is a very simplified one. In reality, Convolutional Neural Networks learn multiple feature detectors during training and apply each one to the input to produce its own feature map. Together, these feature maps form a convolutional layer (see the figure below).
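As a rough sketch of this idea, the snippet below applies four feature detectors to the same image and stacks the resulting feature maps. Everything here is illustrative: the detectors are filled with random numbers as a stand-in for values a real network would learn during training, the image is random, and the helper name `convolve2d_valid` is our own.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide the kernel over the image with a stride of one and no padding."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for r in range(oh):
        for c in range(ow):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

rng = np.random.default_rng(0)

# Hypothetical 7 x 7 binary image
image = rng.integers(0, 2, size=(7, 7)).astype(float)

# Four hypothetical 3 x 3 feature detectors (random, not learned)
kernels = rng.normal(size=(4, 3, 3))

# One feature map per detector, stacked along a new first axis
feature_maps = np.stack([convolve2d_valid(image, k) for k in kernels])
print(feature_maps.shape)  # (4, 5, 5)
```

Each slice `feature_maps[i]` is the feature map produced by detector `i`, so the layer's output has one channel per detector.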
