What is ELU?
Exponential Linear Unit (ELU) is an activation function that can improve a model's accuracy and reduce its training time. It is mathematically represented as follows:

$$
\text{ELU}(x) =
\begin{cases}
x & \text{if } x > 0 \\
\alpha\,(e^{x} - 1) & \text{if } x \leq 0
\end{cases}
$$

In the formula above, $\alpha$ is a positive constant (commonly set to 1) that determines the value to which ELU saturates for large negative inputs.
Need for ELU
The ReLU activation function became popular because it addressed the vanishing gradient problem: the gradients of saturating activation functions, such as the sigmoid, become very small, making it difficult to train larger models.
At the same time, however, ReLU introduced a problem of its own, called the dying ReLU problem. This occurs when a neuron gets stuck outputting 0 for every input; its gradient is then also 0, so its weights stop updating.
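As a small illustration (the numbers below are made up for the example), a neuron whose pre-activations have all drifted negative produces zero output and zero gradient under ReLU, so its weights can no longer update:

```python
import numpy as np

# Hypothetical pre-activations that have all drifted negative
z = np.array([-3.2, -1.5, -0.4, -2.7])

relu_out = np.maximum(0.0, z)         # ReLU output: all zeros
relu_grad = (z > 0).astype(float)     # ReLU gradient: all zeros

print(relu_out)   # [0. 0. 0. 0.]
print(relu_grad)  # [0. 0. 0. 0.] -> no gradient flows back, the neuron is "dead"
```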
In contrast, ELU produces negative values for negative inputs, which (much like batch normalization) push the mean activation closer to 0. This improves training speed.
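A quick sketch of this effect, assuming zero-mean (standard normal) pre-activations and α = 1; the exact numbers are illustrative, not from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.standard_normal(100_000)        # zero-mean pre-activations

relu = np.maximum(0.0, z)               # ReLU clips negatives to 0
elu = np.where(z > 0, z, np.expm1(z))   # ELU with alpha = 1

print(f"mean of ReLU outputs: {relu.mean():.3f}")  # about 0.40
print(f"mean of ELU outputs:  {elu.mean():.3f}")   # about 0.16, closer to 0
```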
Even though Parametric ReLU and Leaky ReLU also produce negative values, they are not smooth functions. ELU is smooth for negative values, which makes it more robust to noise.
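To see the difference in smoothness, we can compare the derivatives of ELU and Leaky ReLU just below and just above 0. This is a minimal sketch that assumes α = 1 for ELU and a slope of 0.01 for Leaky ReLU (both values are assumptions for the example):

```python
import numpy as np

alpha, slope = 1.0, 0.01   # assumed ELU constant and Leaky ReLU slope

def elu_grad(x):
    # d/dx ELU: 1 for x > 0, alpha * exp(x) for x <= 0
    return np.where(x > 0, 1.0, alpha * np.exp(x))

def leaky_relu_grad(x):
    # d/dx Leaky ReLU: 1 for x > 0, a fixed small slope otherwise
    return np.where(x > 0, 1.0, slope)

for x in (-1e-6, 1e-6):
    print(x, float(elu_grad(x)), float(leaky_relu_grad(x)))
# ELU's gradient approaches 1 from both sides of 0 (continuous),
# while Leaky ReLU's jumps abruptly from 0.01 to 1.
```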
Code
Here, we implement ELU in Python:
```python
import numpy as np
import matplotlib.pyplot as plt

# initializing the constant
α = 1.0

def ELU(x):
    if x > 0:
        return x
    return α * (np.exp(x) - 1)  # negative branch: α(e^x - 1)

x = np.linspace(-5.0, 5.0)  # 50 evenly spaced inputs in [-5, 5]
result = []
for i in x:
    result.append(ELU(i))

plt.plot(x, result)
plt.title("ELU activation function")
plt.xlabel("Input")
plt.ylabel("Output")
plt.grid(True)
plt.savefig('output/elu_plot.png')
```
Code explanation
- Lines 9–10: We implement the equation mentioned above.
- Line 12: We use `np.linspace` to generate evenly spaced numbers between -5.0 and 5.0. By default, it generates a total of 50 numbers.
- Lines 17–22: We use the matplotlib library to plot the output of ELU over the given range.
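As a side note, the Python-level loop above can be replaced with a vectorized version. The sketch below uses `np.where` and `np.expm1`; the function name `elu` and the default `alpha=1.0` are our own choices for the example:

```python
import numpy as np

def elu(x, alpha=1.0):
    # Vectorized ELU: x for positive inputs, alpha * (exp(x) - 1) otherwise
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, x, alpha * np.expm1(x))

print(elu([-2.0, -0.5, 0.0, 0.5, 2.0]))
```

Most deep learning frameworks also ship ELU as a built-in (for example, `torch.nn.ELU` in PyTorch), so in practice it is rarely implemented by hand.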