Image Classification Using Convolutional Neural Networks

Learn about the convolutional neural network, its architecture, potential applications, and the reasons why it’s preferred for image classification.

In this chapter, we will primarily focus on the fundamentals of convolutional neural networks (CNNs) and their architecture. CNNs are a powerful and widely used approach to handling and analyzing images. They allow us to extract meaningful features and patterns. By exploring the capabilities of CNNs, we can effectively address complex image-related tasks. The knowledge gained in this chapter will enable us to apply CNNs to simulate mislabeled data and analyze the effect of mislabeled data on model performance.

What is a convolutional neural network?

A CNN is a type of deep neural network that effectively identifies patterns in image data. It takes the input images and learns filter (kernel) coefficients that slide over the image to extract features and patterns. These filter values are acquired during the training process through a backpropagation technique (adjusting weight and biases) that enables the network to update the filter parameters to improve a model’s performance. In a CNN, data is processed by many array layers. A CNN uses the input as a two-dimensional array that operates directly on image patches rather than individual pixels.

The slides below visualize convolution’s effect on an input image by applying a 3x3 kernel to a 5x5 matrix. The kernel slides over the image, starting from the top-left corner, and moves one pixel at a time in a left-to-right, top-to-bottom fashion. At each position, the kernel multiplies its values with the corresponding values in the image matrix, and the resulting products are added together to produce a single output value. This process continues until the kernel covers the entire image, resulting in a new matrix convolved with the kernel.

Get hands-on with 1200+ tech skills courses.