
Autoencoders

Explore the concept of autoencoders, neural networks designed to perform nonlinear dimensionality reduction by encoding data into a compressed latent space and decoding it back. Understand how autoencoders overcome PCA limitations, learn their architecture including encoder and decoder components, and see their practical uses in image denoising and anomaly detection.

In the previous lesson, we used PCA for linear data compression, but real-world data is often too complex and tangled to be simplified by straight lines alone. Autoencoders represent the next step in dimensionality reduction, using the power of neural networks to learn highly effective nonlinear representations of data. This unique network architecture learns to compress data into a compact code and then accurately reconstruct it, allowing us to capture intricate patterns for tasks like noise removal and anomaly detection.

Nonlinear dimensionality reduction

Datasets might not always conform to a linear subspace. In such cases, employing linear techniques like PCA for dimensionality reduction proves ineffective. To address this, nonlinear dimensionality reduction techniques come into play. In this approach, data points are encoded/transformed via a nonlinear function. Let’s consider a scenario with n data points existing in a d-dimensional space, organized as columns in a matrix denoted as X_{d \times n}. The aim is to derive corresponding k-dimensional nonlinear encodings, the latent space (the compressed internal representation of the input data learned by the encoder), denoted as Z_{k \times n}, achieved through the following process:

Z = f_{W_e}(X)

Here, f represents a nonlinear parametric function, often referred to as an encoder, driven by parameters W_e. If the goal is to encode in a manner that permits successful decoding or reconstruction of the original data, a nonlinear function g_{W_d}, known as a decoder, can be formulated as follows:

\hat{X} = g_{W_d}(Z)

The squared reconstruction loss \|X - \hat{X}\|^2 is minimized to estimate the parameters of both the encoder and decoder. This process facilitates learning representations that capture the salient features of the data in a manner that transcends the limitations of linear approaches like PCA.
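Putting the two maps together, training estimates the encoder and decoder parameters jointly by minimizing the reconstruction error over the data:

\min_{W_e, W_d} \left\| X - g_{W_d}\left(f_{W_e}(X)\right) \right\|^2

This is the same least-squares reconstruction objective that PCA optimizes, except that f and g are now nonlinear functions rather than linear projections.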

Note:

  • It’s a prevalent practice to represent both the encoder and decoder as neural networks, often referred to as encoder-decoder networks.

  • When the intention of the encoder-decoder network is to achieve dimensionality reduction, the network is commonly referred to as an autoencoder.

Autoencoders are a specific type of neural network architecture used for unsupervised learning tasks. They’re designed to take high-dimensional input data, encode it into a lower-dimensional representation, and then decode that representation back into the original format. This allows autoencoders to be used for tasks such as dimensionality reduction and feature extraction.

Architecture of an autoencoder

Unveiling the need for encoders

In the vast universe of data, we often encounter high-dimensional and noisy information. To make sense of such data, we require a mechanism to extract its essential features. An encoder acts as a detective, distilling the most important characteristics from the data and transforming it into a compressed representation. This condensed information holds the key to understanding the underlying patterns hidden within.

An encoder consists of layers that progressively extract higher-level features from the input. The architecture of the encoder is as follows:

Python 3.10.4
import torch.nn.functional as F
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super(Encoder, self).__init__()
        self.input_layer = nn.Linear(input_dim, hidden_dim)
        self.hidden_layer = nn.Linear(hidden_dim, hidden_dim)
        self.output_layer = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        x = F.relu(self.input_layer(x))
        x = F.relu(self.hidden_layer(x))
        x = F.relu(self.output_layer(x))
        return x

Here is the explanation for the code above:

  • __init__: We initialize the neural network layers: an input layer, a hidden layer, and an output layer. These layers are the building blocks that transform input data into a compact representation.

  • forward: We define how data flows through the layers. It uses ReLU activation to process the input sequentially, compressing it as it moves through each layer, ultimately producing an encoded representation.
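As a quick sanity check, the encoder above can be exercised on random data (the batch size and dimensions below are illustrative, not tied to any particular dataset):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super(Encoder, self).__init__()
        self.input_layer = nn.Linear(input_dim, hidden_dim)
        self.hidden_layer = nn.Linear(hidden_dim, hidden_dim)
        self.output_layer = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        x = F.relu(self.input_layer(x))
        x = F.relu(self.hidden_layer(x))
        x = F.relu(self.output_layer(x))
        return x

x = torch.randn(8, 64)   # a batch of 8 samples with 64 features each
encoder = Encoder(input_dim=64, hidden_dim=32, output_dim=16)
z = encoder(x)
print(z.shape)           # torch.Size([8, 16])
```

Each 64-dimensional input is squeezed into a 16-dimensional code; that compact code is what the decoder will later work from.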

1.

What is the primary purpose of encoders in autoencoders, and are they exclusively designed for data compression?


The power of decoders

Where encoders excel at compressing data, decoders step in to reverse the process. They unwind the compact representation, piece by piece, to recover the original data. Decoders act as essential intermediaries, mapping the reduced dimensions of encoded data back to the complex, high-dimensional input space to produce a faithful reconstruction.

Python 3.10.4
import torch.nn.functional as F
import torch.nn as nn

class Decoder(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim, output_shape=None):
        super(Decoder, self).__init__()
        self.input_layer = nn.Linear(input_dim, hidden_dim)
        self.hidden_layer = nn.Linear(hidden_dim, hidden_dim)
        self.output_layer = nn.Linear(hidden_dim, output_dim)
        # Target shape for the reconstruction, e.g., (28, 28) for image data
        self.output_shape = output_shape

    def forward(self, x):
        x = F.relu(self.input_layer(x))
        x = F.relu(self.hidden_layer(x))
        x = F.relu(self.output_layer(x))
        if self.output_shape is not None:
            # Reshape the flat output back to the original data shape
            x = x.reshape(-1, *self.output_shape)
        return x

Here is the explanation for the code above:

  • __init__: We initialize the decoder layers. The decoder has a similar structure to the encoder (input layer to hidden layer to output layer), but its role is to reconstruct the original input from the latent representation. These layers progressively transform the latent vector back into the original data space.

  • forward: We define the data flow through the decoder. Like the encoder, it uses ReLU activation to sequentially process the encoded input, gradually expanding it back toward its original form. The optional reshape at the end restores the output to the original data shape (for example, back to image dimensions).

Designing the autoencoder

Autoencoders consist of three essential components: an encoder, a decoder, and a bottleneck layer in between. The encoder is responsible for mapping the input data to a compressed representation, while the decoder reconstructs the original input from this compressed representation. The bottleneck layer, often called the latent space or encoding layer, acts as a critical bridge between the encoder and decoder. It enforces dimensionality reduction, forcing the encoder to capture the most salient features of the data. The decoder then uses this reduced representation to reconstruct the original input.

Let’s explore a simple implementation of an autoencoder. The following code defines a symmetrical Autoencoder network that compresses 64 input features down to a 16-feature latent space and then expands the data back to 64 features for reconstruction. The goal is to learn the most efficient 16-dimensional representation of the original data.

Python 3.10.4
import torch
import torch.nn.functional as F
import torch.nn as nn
from torchsummary import summary

# Define the encoder class
class Encoder(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super(Encoder, self).__init__()
        self.input_layer = nn.Linear(input_dim, hidden_dim)
        self.hidden_layer = nn.Linear(hidden_dim, hidden_dim)
        self.output_layer = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        x = F.relu(self.input_layer(x))
        x = F.relu(self.hidden_layer(x))
        x = F.relu(self.output_layer(x))
        return x

# Define the decoder class
class Decoder(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super(Decoder, self).__init__()
        self.input_layer = nn.Linear(input_dim, hidden_dim)
        self.hidden_layer = nn.Linear(hidden_dim, hidden_dim)
        self.output_layer = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        x = F.relu(self.input_layer(x))
        x = F.relu(self.hidden_layer(x))
        x = F.relu(self.output_layer(x))
        return x  # No reshape needed: the data stays flat (batch, features)

# Define the autoencoder class
class Autoencoder(nn.Module):
    def __init__(self, encoder_dim, decoder_dim):
        super(Autoencoder, self).__init__()
        self.encoder = Encoder(*encoder_dim)
        self.decoder = Decoder(*decoder_dim)

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return x

# Define the dimensions for the encoder and decoder
input_dim = 64
hidden_dim = 32
latent_dim = 16
encoder_dim = (input_dim, hidden_dim, latent_dim)
decoder_dim = (latent_dim, hidden_dim, input_dim)

# Create an instance of the autoencoder model
model = Autoencoder(encoder_dim, decoder_dim)

batch_size = 1  # You can adjust the batch size as needed
input_shape = (batch_size, input_dim)
summary(model, input_shape)

The explanation for the code is given below:

  • Encoder: compresses input data from input_dim to output_dim through two hidden layers, applying ReLU activation functions.
  • Decoder: reverses the encoding process, expanding data from output_dim back to input_dim using ReLU activations.
  • Autoencoder: combines the encoder and decoder to form a neural network that learns to reconstruct input data.
  • The final lines define the dimensions for the encoder and decoder, create an instance of the autoencoder model, set the batch size for input data (1 in this case), and generate a summary of the model’s architecture using the torchsummary library.

In the output, the network begins with the encoder (layers Linear-1 to Linear-3), where the original 64 input features are compressed sequentially through 64 → 32 → 16 dimensions. The output of the final encoder layer (Linear-3) is the 16-dimensional latent representation (the bottleneck), which contains the essential features of the input. This compressed code is then passed to the decoder (layers Linear-5 to Linear-7). The decoder reverses the process, expanding the data back out through 16 → 32 → 64 dimensions, aiming to reconstruct the original input features. The overall model is simple yet effective, consisting of a total of 7,376 trainable parameters: the weights and biases that the network must learn to successfully compress and reconstruct the data during the training phase.
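To see learning in action, here is a minimal training sketch that minimizes the mean squared reconstruction error with Adam. The random dataset, learning rate, and epoch count are illustrative stand-ins, not values from a specific application:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Same architecture as in the lesson
class Encoder(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super().__init__()
        self.input_layer = nn.Linear(input_dim, hidden_dim)
        self.hidden_layer = nn.Linear(hidden_dim, hidden_dim)
        self.output_layer = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        x = F.relu(self.input_layer(x))
        x = F.relu(self.hidden_layer(x))
        return F.relu(self.output_layer(x))

class Decoder(Encoder):
    pass  # identical layer structure, mirrored dimensions

class Autoencoder(nn.Module):
    def __init__(self, encoder_dim, decoder_dim):
        super().__init__()
        self.encoder = Encoder(*encoder_dim)
        self.decoder = Decoder(*decoder_dim)

    def forward(self, x):
        return self.decoder(self.encoder(x))

torch.manual_seed(0)
model = Autoencoder((64, 32, 16), (16, 32, 64))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

X = torch.rand(256, 64)  # stand-in dataset: 256 samples, 64 features

losses = []
for epoch in range(200):
    optimizer.zero_grad()
    reconstruction = model(X)
    loss = criterion(reconstruction, X)  # squared reconstruction error
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

Because the input serves as its own target, no labels are needed; the falling loss shows the bottleneck learning to preserve the information required for reconstruction.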

Applications and beyond

Autoencoders have gained immense popularity due to their versatility and wide range of applications. From image generation and dimensionality reduction to recommendation systems and natural language processing, they have revolutionized the field of machine learning. By extracting meaningful representations, autoencoders empower us to navigate the complex world of data with ease.

Image denoising

One of the notable strengths of autoencoders lies in their ability to perform image denoising. Through training on a dataset containing both noisy images and their corresponding clean versions, autoencoders learn to remove noise and accurately reconstruct the original images. This capability finds practical applications in domains like medical imaging and photography, where noise reduction is crucial for enhancing image quality and analysis.

Autoencoder removing noise
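The training recipe behind this differs from plain reconstruction in one detail: the model receives the noisy input but is scored against the clean target. A minimal sketch of one such training step (the small stand-in model, noise level, and random "images" here are illustrative, not from a real dataset):

```python
import torch
import torch.nn as nn

# Illustrative stand-in: any autoencoder architecture works here
model = nn.Sequential(nn.Linear(64, 16), nn.ReLU(), nn.Linear(16, 64))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

torch.manual_seed(0)
clean = torch.rand(32, 64)                      # clean images, flattened
noisy = clean + 0.1 * torch.randn_like(clean)   # corrupted copies

# One denoising training step: noisy in, clean target out
optimizer.zero_grad()
denoised = model(noisy)
loss = criterion(denoised, clean)  # compare against the CLEAN data
loss.backward()
optimizer.step()
```

Since the noise is random, the network cannot memorize it; the cheapest way to reduce the loss is to learn the underlying clean structure and discard the noise.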

Anomaly detection

Autoencoders demonstrate exceptional effectiveness in anomaly detection. By training them on a set of normal data, they become skilled at reconstructing this data accurately. When presented with anomalous data, the reconstruction error increases significantly, allowing autoencoders to detect outliers and identify anomalies within datasets. This property makes them invaluable tools for identifying and flagging irregular patterns or data points in various fields, including cybersecurity, fault detection, and fraud detection.
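In code, the detection rule reduces to a per-sample reconstruction error plus a threshold. The sketch below shows the scoring mechanics with an untrained stand-in model; in practice you would first train the autoencoder on normal data only, and the thresholding rule shown is one common illustrative choice:

```python
import torch
import torch.nn as nn

# Stand-in autoencoder; in practice, train it on normal data first
model = nn.Sequential(nn.Linear(64, 16), nn.ReLU(), nn.Linear(16, 64))

def reconstruction_error(model, batch):
    """Per-sample mean squared reconstruction error."""
    with torch.no_grad():
        reconstruction = model(batch)
    return ((batch - reconstruction) ** 2).mean(dim=1)

torch.manual_seed(0)
data = torch.rand(100, 64)
errors = reconstruction_error(model, data)

# Flag samples whose error is far above the typical level
threshold = errors.mean() + 3 * errors.std()   # illustrative rule
anomalies = torch.nonzero(errors > threshold).flatten()
print(f"flagged {anomalies.numel()} of {data.shape[0]} samples")
```

A trained model reconstructs normal samples well (low error) but struggles on anything unlike its training data, so high-error samples are the candidates for anomalies.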

Try this quiz to review what you’ve learned so far.

1.

Which component of an autoencoder is responsible for mapping the input data to a compressed representation?

A.

Encoder

B.

Decoder



Conclusion

Autoencoders are a specific type of neural network designed for unsupervised learning. We learned that the model consists of an encoder to nonlinearly compress the input into a latent space (bottleneck) and a decoder to reconstruct the data. By minimizing the reconstruction loss, autoencoders learn efficient, nonlinear feature representations that excel where linear methods like PCA fail. This capability makes them highly valuable for practical applications such as cleaning noisy images and detecting irregular patterns in data (anomaly detection).