Convolution

In this lesson, we are going to learn about the first step in implementing the Convolutional Neural Network: the convolution operation.

We'll cover the following...

- What is a convolution operation?
- The need for the convolution operation
- How do CNNs actually perform convolution?

What is a convolution operation?

In mathematical terms, a convolution is an operation on two functions that produces a third function, which expresses how the shape of one function is modified by the other. Sounds complicated?

That’s okay. We will discuss the step-by-step process which will help you understand how it works. But before that, we need to take a look at the formula:

(f * g)(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\,\mathrm{d}\tau
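For sampled data, the integral above becomes a sum over discrete positions. The following is a minimal sketch of that discrete form, assuming NumPy is available; the arrays `f` and `g` and the helper `convolve_1d` are made up for illustration and are not part of the lesson. It compares a hand-written sum against NumPy's built-in `np.convolve`.

```python
import numpy as np

# Two short, hypothetical signals (illustrative values only).
f = np.array([1.0, 2.0, 3.0])
g = np.array([0.0, 1.0, 0.5])

def convolve_1d(f, g):
    """Discrete convolution: out[t] = sum over tau of f[tau] * g[t - tau]."""
    n = len(f) + len(g) - 1
    out = np.zeros(n)
    for t in range(n):
        for tau in range(len(f)):
            # Only add terms where g is defined at index t - tau.
            if 0 <= t - tau < len(g):
                out[t] += f[tau] * g[t - tau]
    return out

manual = convolve_1d(f, g)
library = np.convolve(f, g)  # NumPy's "full" discrete convolution

print(manual)   # [0.  1.  2.5 4.  1.5]
print(library)  # same values
```

The index `t - tau` in the inner loop plays the same role as the argument of `g` in the integral: one function is flipped and slid across the other.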

Those of you who have worked in any field that involves signal processing are probably familiar with the convolution function.

But don’t worry, we will now look at an example of how this operation actually works.

Let’s say we have an image of a smiley face (a simple image for easier understanding; the same concept applies to complex images too). We will now turn it into a matrix by assigning a value of 0 to every pixel with no color and a value of 1 to every black pixel. See the image below to understand this concept.

Follow these steps to create a feature map using a feature detector, a smaller matrix that we slide across the input image:

- Place the feature detector on the top-left corner of your input image, count the cells where the feature detector matches the image underneath it (a 1 in both), and write this count into the top-left cell of the feature map matrix.
- Repeat this step after shifting the feature detector to the right by one pixel. The size of this shift is called the stride, and since we are shifting by one pixel, this is a stride of one pixel. You can use a stride of more than one pixel, but a larger stride can skip over some important features in your image.
- In this example, the first cell gets a value of 0 because none of the cells of the feature detector match the corresponding cells of the input image.
- After you have gone through the whole first row, move the feature detector down to the next row and go through the same process.

Now go through the complete input image in this way and check whether your output matches the correct feature map.
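The steps above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the lesson's own code: the 7 x 7 `image`, the 3 x 3 `detector`, and the function name `feature_map` are all assumptions chosen for the example, since the lesson's actual matrices are shown only in its figures. For matrices of 0s and 1s, counting the cells that contain a 1 in both the detector and the image patch is the same as summing their element-wise product.

```python
import numpy as np

def feature_map(image, detector, stride=1):
    """Slide `detector` over `image` and count matching 1s at each position."""
    kh, kw = detector.shape
    rows = (image.shape[0] - kh) // stride + 1
    cols = (image.shape[1] - kw) // stride + 1
    out = np.zeros((rows, cols), dtype=int)
    for i in range(rows):
        for j in range(cols):
            top, left = i * stride, j * stride
            patch = image[top:top + kh, left:left + kw]
            # For 0/1 matrices, the element-wise product is 1 only where both are 1.
            out[i, j] = np.sum(patch * detector)
    return out

# Hypothetical 7 x 7 smiley face: 1 = black pixel, 0 = no color.
image = np.array([
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
])

# Hypothetical 3 x 3 feature detector.
detector = np.array([
    [0, 0, 1],
    [1, 0, 0],
    [0, 1, 1],
])

fm = feature_map(image, detector, stride=1)
print(fm.shape)  # (5, 5)
print(fm[0, 0])  # 0: no matching cells in the top-left position
print(fm)
```

Note that this sliding count does not flip the detector the way the mathematical formula flips `g`; deep learning libraries commonly compute it this way and still call the operation convolution.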

By the way, just as a feature detector can also be referred to as a kernel or a filter, a feature map is also known as an activation map, and these two terms are interchangeable as well.

The need for the convolution operation

The main reason is to reduce the size of the input image while keeping track of where the features were detected. The larger your strides are (the movements across pixels), the smaller your feature map is. This is because the stride is the distance the filter moves over the image at each step: with large strides, the filter is placed at fewer positions, skips portions of the image, and generates a smaller feature map. When dealing with real images, you will often find it necessary to widen your strides. Here, we were dealing with a 7 x 7 input image, but real images tend to be substantially larger and more complex.
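To make the effect of the stride concrete, here is a small sketch of the usual size calculation when no padding is used: the filter fits at floor((n - k) / s) + 1 positions along each side. The function name `feature_map_size` and the 3 x 3 filter size are assumptions for illustration; only the 7 x 7 input size comes from the example above.

```python
def feature_map_size(input_size, filter_size, stride=1):
    """Number of filter positions along one side, with no padding."""
    return (input_size - filter_size) // stride + 1

sizes = {}
for stride in (1, 2, 3):
    # 7 x 7 input (from the lesson) with an assumed 3 x 3 filter.
    sizes[stride] = feature_map_size(7, 3, stride)
    print(f"stride {stride}: {sizes[stride]} x {sizes[stride]} feature map")

# stride 1: 5 x 5 feature map
# stride 2: 3 x 3 feature map
# stride 3: 2 x 2 feature map
```

Even with a stride of one, the feature map is smaller than the input, and each increase in stride shrinks it further.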

How do CNNs actually perform convolution?

The example we gave above is a very simplified one. In reality, Convolutional Neural Networks develop multiple feature detectors and use them to produce several feature maps, one per feature detector. Together, these feature maps make up a convolutional layer (see the figure below).
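As a rough sketch of this idea, the snippet below applies several feature detectors to the same image and stacks the resulting feature maps. Everything here is an assumption for illustration: the image and the three detectors are random 0/1 matrices (a real network would learn its detectors during training), and the code relies on NumPy's `sliding_window_view`, which is available in NumPy 1.20 and later.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)

# Hypothetical 7 x 7 binary image and three hypothetical 3 x 3 detectors.
image = rng.integers(0, 2, size=(7, 7))
detectors = rng.integers(0, 2, size=(3, 3, 3))  # (detector index, height, width)

# All 3 x 3 patches of the image at stride 1: shape (5, 5, 3, 3).
windows = sliding_window_view(image, (3, 3))

# For each detector f and each position (i, j), sum the element-wise product
# of the patch and the detector. Result: one 5 x 5 feature map per detector.
feature_maps = np.einsum("ijkl,fkl->fij", windows, detectors)

print(feature_maps.shape)  # (3, 5, 5)
```

The first axis of `feature_maps` indexes the detectors, so `feature_maps[0]` is the feature map produced by the first detector, and so on.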
