**Rectified Linear Unit (ReLU)** is one of the most popular activation functions in deep learning. Its purpose is to introduce non-linearity into a neural network.

In simple terms, neural networks are composed of layers of interconnected neurons. Each connection between neurons has a weight, and each neuron computes a weighted sum of its inputs. Without activation functions like ReLU, a neural network would be a composition of linear operations, and a composition of linear operations is itself linear, making the network no more powerful than a single-layer linear regression.
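To see why, consider stacking two `nn.Linear` layers with no activation in between: the pair can always be collapsed into a single linear layer. The sketch below demonstrates this (the layer sizes and input are arbitrary, chosen only for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.rand(1, 4)

# Two linear layers with no activation in between
fc1 = nn.Linear(4, 8)
fc2 = nn.Linear(8, 3)
stacked = fc2(fc1(x))

# Collapse them into one equivalent linear layer:
# fc2(fc1(x)) = W2 (W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2)
combined = nn.Linear(4, 3)
with torch.no_grad():
    combined.weight.copy_(fc2.weight @ fc1.weight)
    combined.bias.copy_(fc2.weight @ fc1.bias + fc2.bias)

print(torch.allclose(stacked, combined(x), atol=1e-5))  # True
```

Inserting a non-linearity such as ReLU between the two layers breaks this collapse, which is exactly what gives depth its power.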

*Non-linearity is essential because most real-world problems are not linear in nature. By allowing the network to learn complex relationships, ReLU helps neural networks become more expressive and capable of solving a wide range of problems, including image classification, natural language processing, and more.*

The ReLU activation function returns its input unchanged when the input is positive, and outputs zero otherwise.


It is a simple mathematical function defined as:

$$f(x) = \max(0, x)$$

where $x$ is the input to the activation function and $f(x)$ is its output.
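To make the definition concrete, here is a minimal hand-rolled version of the same function (just a sketch; PyTorch's built-in `nn.ReLU`, used below, is what you would use in practice):

```python
import torch

def relu(x: torch.Tensor) -> torch.Tensor:
    # Element-wise f(x) = max(0, x)
    return torch.maximum(torch.zeros_like(x), x)

print(relu(torch.tensor([-2.0, 0.0, 3.0])))  # tensor([0., 0., 3.])
```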

Let’s see the implementation of the ReLU activation function in PyTorch.

```python
import torch
import torch.nn as nn

# Create a tensor with some values
x = torch.tensor([-5.0, 2.5, -1.75, 1.6])

# Define the ReLU activation function
relu = nn.ReLU()

# Apply ReLU to the tensor
output = relu(x)

# Print the result
print(output)
```

**Lines 1–2:** We import the `torch` library for tensor creation and the `torch.nn` module to load activation functions.

**Line 5:** We create a 1-dimensional tensor `x`.

**Line 8:** We create an instance of the `ReLU` activation function from PyTorch's neural network module `nn` and store it in the `relu` variable.

**Lines 11–14:** We apply the `ReLU` activation function to the tensor `x` and print the output to the console.
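Running this code prints `tensor([0.0000, 2.5000, 0.0000, 1.6000])`: the negative entries are replaced with zero while the positive ones pass through unchanged. If you prefer not to instantiate a module, PyTorch also exposes the same operation through its functional API (a minimal equivalent sketch):

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-5.0, 2.5, -1.75, 1.6])
output = F.relu(x)  # same element-wise max(0, x) as nn.ReLU()
print(output)       # tensor([0.0000, 2.5000, 0.0000, 1.6000])
```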

Let’s see the implementation of the ReLU activation function in a neural network using PyTorch.

```python
import torch
import torch.nn as nn

# Define the neural network architecture
class Neural_Network(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(Neural_Network, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)  # Fully connected layer 1
        self.relu = nn.ReLU()  # ReLU activation function
        self.fc2 = nn.Linear(hidden_size, output_size)  # Fully connected layer 2
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        out = self.fc1(x)  # Apply the first fully connected layer
        out = self.relu(out)  # Apply the ReLU activation function
        out = self.fc2(out)  # Apply the second fully connected layer
        out = self.sigmoid(out)
        return out

# Define network parameters
input_size = 64  # Number of input features
hidden_size = 128  # Number of neurons in the hidden layer
output_size = 2  # Number of output classes

# Input data
input_data = torch.rand(32, input_size)  # 32 is the batch size
target = torch.randint(0, 2, (32, output_size), dtype=torch.float32)  # Random binary target values

# Create an instance of the Neural_Network model
model = Neural_Network(input_size, hidden_size, output_size)

# Define the loss function (binary cross-entropy) and optimizer (Adam)
criterion = nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

# Example training loop
n_epochs = 10  # Define the number of training epochs
for epoch in range(n_epochs):
    # Forward pass
    outputs = model(input_data)
    loss = criterion(outputs, target)  # Compute the loss

    # Backward pass and optimization
    optimizer.zero_grad()  # Clear gradients
    loss.backward()  # Backpropagate to compute gradients
    optimizer.step()  # Update the model parameters

    # Print the loss for each epoch
    print(f'Epoch [{epoch + 1}/{n_epochs}], Loss: {loss.item()}')
```

**Line 9:** We create an instance of the ReLU activation function using `nn.ReLU()` and store it as an attribute of the `Neural_Network` class named `self.relu`.

**Line 15:** We apply the ReLU activation function to the output of the first fully connected layer. It replaces any negative values in the `out` tensor with zero, introducing the non-linearity that lets the network learn complex patterns.
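As a side note, `nn.ReLU` holds no learnable parameters, so the same forward pass is often written with the functional API instead of a module attribute. Below is a sketch of that alternative style (the class name `Neural_Network_Functional` is hypothetical, used only for illustration):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Neural_Network_Functional(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.fc2 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out = F.relu(self.fc1(x))  # ReLU applied via the functional API
        return torch.sigmoid(self.fc2(out))

model = Neural_Network_Functional(64, 128, 2)
print(model(torch.rand(32, 64)).shape)  # torch.Size([32, 2])
```

Both styles compute the same result; storing the module as an attribute simply makes the activation visible when printing the model.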

For more details on how to build a neural network, take a look at this Educative Answer.
