Implementation of ReLU in PyTorch

The Rectified Linear Unit (ReLU) is one of the most popular activation functions in deep learning. Its purpose is to introduce non-linearity into a neural network.

In simple terms, neural networks are composed of layers of interconnected neurons. Each connection between neurons has a weight, and each neuron computes a weighted sum of its inputs. Without activation functions like ReLU, a neural network would essentially be a chain of linear operations, making it no more powerful than a single linear layer (i.e., linear regression), as the sketch below illustrates.
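
To see this concretely, here is a minimal sketch (an addition to the original text, with arbitrarily chosen layer sizes): two nn.Linear layers stacked without an activation in between collapse to a single linear transformation.

import torch
import torch.nn as nn

torch.manual_seed(0)

# Two linear layers stacked with no activation in between
fc1 = nn.Linear(4, 8, bias=False)
fc2 = nn.Linear(8, 3, bias=False)

x = torch.rand(5, 4)  # a small batch of 5 samples with 4 features each

# Passing data through both layers...
stacked = fc2(fc1(x))

# ...matches a single linear map whose weight is the product of the two weights
combined_weight = fc2.weight @ fc1.weight
single = x @ combined_weight.T

print(torch.allclose(stacked, single, atol=1e-6))  # True: no extra expressive power

Placing ReLU between the two layers breaks this equivalence and gives the extra layer real expressive power.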

Non-linearity is essential because most real-world problems are not linear in nature. By allowing the network to learn complex relationships, ReLU helps neural networks become more expressive and capable of solving a wide range of problems, including image classification, natural language processing, and more.

The ReLU activation function returns the input unchanged if it is positive, and outputs zero if it is negative.


Mathematical implementation

It is a simple mathematical function defined as:

ReLU(x) = max(0, x)

Here, x represents the input to the function, and ReLU(x) returns the maximum of 0 and the input value x. In other words, it outputs the input value if it is greater than or equal to zero, and it outputs zero if the input is negative.
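
As a quick sanity check (an addition beyond the original text), the same definition can be written element-wise with torch.maximum and compared against PyTorch's built-in nn.ReLU:

import torch
import torch.nn as nn

def relu_manual(x):
    # ReLU(x) = max(0, x), applied element-wise
    return torch.maximum(x, torch.zeros_like(x))

x = torch.tensor([-3.0, 0.0, 4.2])
print(relu_manual(x))  # tensor([0.0000, 0.0000, 4.2000])
print(nn.ReLU()(x))    # same result using the built-in module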

ReLU activation function

Implementation of ReLU

Let’s see the implementation of the ReLU activation function in PyTorch.

import torch
import torch.nn as nn

# Create a tensor with some values
x = torch.tensor([-5.0, 2.5, -1.75, 1.6])

# Define the ReLU activation function
relu = nn.ReLU()

# Apply ReLU to the tensor
output = relu(x)

# Print the result
print(output)

Code explanation

Lines 1–2: We import the torch library for tensor creation and the torch.nn module, which provides layers and activation functions such as nn.ReLU.

Line 5: We create a 1-dimensional tensor x containing a mix of positive and negative values.

Line 8: We create an instance of the ReLU activation function from PyTorch’s neural network module nn and store it in the relu variable.

Lines 11–14: We apply the ReLU activation function to the tensor x and print the output to the console.
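
As a side note (not covered in the example above), PyTorch also exposes ReLU in functional form, so the same result can be obtained without creating a module instance; nn.ReLU additionally accepts an inplace flag that overwrites its input tensor instead of allocating a new one.

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.tensor([-5.0, 2.5, -1.75, 1.6])

# Functional form: no nn.ReLU() instance is needed
print(F.relu(x))      # tensor([0.0000, 2.5000, 0.0000, 1.6000])
print(torch.relu(x))  # equivalent tensor-level operation

# In-place variant: modifies its input instead of returning a new tensor
relu_inplace = nn.ReLU(inplace=True)
relu_inplace(x)
print(x)              # x itself now holds tensor([0.0000, 2.5000, 0.0000, 1.6000])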

Implementation in a neural network

Let’s see the implementation of the ReLU activation function in a neural network using PyTorch.

import torch
import torch.nn as nn

# Define the neural network architecture
class Neural_Network(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(Neural_Network, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size) # Fully connected layer 1
        self.relu = nn.ReLU() # ReLU activation function
        self.fc2 = nn.Linear(hidden_size, output_size) # Fully connected layer 2
        self.sigmoid = nn.Sigmoid() # Sigmoid for binary outputs

    def forward(self, x):
        out = self.fc1(x) # Apply the first fully connected layer
        out = self.relu(out) # Apply the ReLU activation function
        out = self.fc2(out) # Apply the second fully connected layer
        out = self.sigmoid(out) # Squash outputs into [0, 1]
        return out

# Define network parameters
input_size = 64 # Number of input features
hidden_size = 128 # Number of neurons in the hidden layer
output_size = 2 # Number of output classes

# Input data
input_data = torch.rand(32, input_size) # 32 is the batch size
target = torch.randint(0, 2, (32, output_size), dtype=torch.float32) # Random binary target values

# Create an instance of the Neural_Network model
model = Neural_Network(input_size, hidden_size, output_size)

# Define the loss function (binary cross-entropy) and optimizer (Adam)
criterion = nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

# Example training loop
n_epochs = 10 # Define the number of training epochs
for epoch in range(n_epochs):
    # Forward pass
    outputs = model(input_data)
    loss = criterion(outputs, target) # Compute the loss

    # Backward pass and optimization
    optimizer.zero_grad() # Clear gradients
    loss.backward() # Backpropagate to compute gradients
    optimizer.step() # Update the model parameters

    # Print the loss for each epoch
    print(f'Epoch [{epoch + 1}/{n_epochs}], Loss: {loss.item()}')

Code explanation

Line 9: We create an instance of the ReLU activation function using nn.ReLU() and store it as an attribute of the Neural_Network class named self.relu.

Line 15: We apply the ReLU activation function to the output of the first fully connected layer. It replaces any negative values in the out tensor with zero, introducing the non-linearity that helps the network learn complex patterns.
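
As a follow-up sketch (not part of the original example), the trained model can then be used for prediction. The snippet below continues from the code above and reuses the model and input_size variables defined there.

# Continues from the code above: reuses `model` and `input_size`
new_data = torch.rand(4, input_size)  # a small batch of 4 unseen samples

model.eval()  # switch the model to evaluation mode
with torch.no_grad():  # gradients are not needed for inference
    probabilities = model(new_data)  # sigmoid outputs in [0, 1]
    predictions = (probabilities > 0.5).float()  # threshold into binary labels

print(probabilities)
print(predictions)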

For more details on how to build a neural network, take a look at this Educative Answer.
