
What is Parametric ReLU?

Muhammad Nabeel


Rectified Linear Unit (ReLU) is an activation function used in neural networks. It is a popular choice among developers and researchers because it tackles the vanishing gradient problem (the gradients of activation functions such as sigmoid become very small, which makes it difficult to train larger models). A problem with ReLU is that it returns zero for any negative input. So, if a neuron only ever receives negative inputs, it always outputs zero and its gradient is zero, so it stops learning. Such a neuron is considered dead. Therefore, using ReLU may leave a significant portion of the neural network doing nothing.

Note: You can learn more about this behavior of ReLU here.
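As a quick illustration, here is a minimal NumPy sketch of ReLU and its gradient; for every negative input both are zero, which is what causes dead neurons:

import numpy as np

def relu(x):
    # ReLU passes positive inputs through unchanged and maps negatives to zero.
    return np.maximum(0.0, x)

def relu_grad(x):
    # The gradient is 1 for positive inputs and 0 for negative inputs, so a
    # neuron whose inputs are always negative receives no gradient and stops
    # learning -- a "dead" neuron.
    return (x > 0).astype(float)

x = np.array([-3.0, -0.5, 0.5, 3.0])
print(relu(x))       # [0.  0.  0.5 3. ]
print(relu_grad(x))  # [0. 0. 1. 1.]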

Researchers have proposed multiple solutions to this problem. Some of them are mentioned below:

  • Leaky ReLU
  • Parametric ReLU
  • ELU
  • SELU

In this Answer, we discuss Parametric ReLU.

Parametric ReLU

The mathematical representation of Parametric ReLU is as follows:

f(y_i) = y_i          if y_i > 0
f(y_i) = α_i * y_i    if y_i ≤ 0

Here, y_i is the input to the activation function from the i-th layer. Each layer learns its own slope parameter, denoted α_i. In the case of a CNN, i indexes the channels, so there is one α_i per channel. Learning the parameter α_i boosts the model's accuracy with negligible additional computational overhead.

Note: When α_i is equal to zero, the function f behaves like ReLU, whereas when α_i is fixed to a small value (such as 0.01), f behaves like Leaky ReLU.

The above equation can also be represented as follows:

f(y_i) = max(0, y_i) + α_i * min(0, y_i)
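As a minimal NumPy sketch, both forms can be implemented and checked to give the same output (the α value of 0.05 here is arbitrary, chosen only for illustration):

import numpy as np

def prelu_piecewise(y, alpha):
    # f(y_i) = y_i if y_i > 0, and alpha_i * y_i otherwise.
    return np.where(y > 0, y, alpha * y)

def prelu_maxmin(y, alpha):
    # Equivalent form: f(y_i) = max(0, y_i) + alpha_i * min(0, y_i).
    return np.maximum(0.0, y) + alpha * np.minimum(0.0, y)

y = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
alpha = 0.05  # illustrative value; in a real network alpha is learned

print(prelu_piecewise(y, alpha))  # [-0.1   -0.025  0.     0.5    2.   ]
print(prelu_maxmin(y, alpha))     # same values as above
print(prelu_piecewise(y, 0.0))    # alpha = 0 reduces to plain ReLU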

Using Parametric ReLU does not noticeably burden the training of the neural network, because the number of extra parameters to learn equals the number of channels, which is small compared to the number of weights the model has to learn. Unlike Leaky ReLU, Parametric ReLU can give a considerable boost to a model's accuracy.

If the coefficient α_i is shared among different channels, we can denote it with α.
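For example, in PyTorch the nn.PReLU module exposes this choice through its num_parameters argument; the sketch below assumes a layer with 16 channels purely for illustration:

import torch.nn as nn

# One learnable slope per channel: the number of extra parameters equals the
# number of channels (16 is a hypothetical channel count).
prelu_per_channel = nn.PReLU(num_parameters=16)

# A single slope shared across all channels: only one extra parameter.
prelu_shared = nn.PReLU(num_parameters=1)

print(sum(p.numel() for p in prelu_per_channel.parameters()))  # 16
print(sum(p.numel() for p in prelu_shared.parameters()))       # 1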

Parametric ReLU vs. Leaky ReLU

In this section, we compare the behavior of Parametric ReLU with that of Leaky ReLU.

Leaky ReLU vs. Parametric ReLU

Here, we plot Leaky ReLU with α = 0.01 and Parametric ReLU with α = 0.05. In practice, this parameter is learned by the neural network and changes during training.
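A plot like the one above can be reproduced with a short matplotlib sketch along these lines (the α values are fixed here only to mirror the figure; in a real network the PReLU slope would be learned):

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-5, 5, 200)
leaky = np.where(x > 0, x, 0.01 * x)  # Leaky ReLU with fixed alpha = 0.01
prelu = np.where(x > 0, x, 0.05 * x)  # Parametric ReLU drawn with alpha = 0.05

plt.plot(x, leaky, label="Leaky ReLU (alpha = 0.01)")
plt.plot(x, prelu, label="Parametric ReLU (alpha = 0.05)")
plt.xlabel("input")
plt.ylabel("output")
plt.legend()
plt.show()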
