What is SELU?
Introduction
SELU (Scaled Exponential Linear Unit) is an activation function defined as:

SELU(x) = λx             if x > 0
SELU(x) = λα(e^x − 1)    if x ≤ 0

Where
λ ≈ 1.0507 and α ≈ 1.6732 are fixed constants.
Problem with ReLU
ReLU became popular because it solved the vanishing gradient problem. However, ReLU introduced a problem of its own, called the dying ReLU problem. It occurs when a neuron's input is always negative: ReLU then outputs 0 for every input, the gradient through that neuron is also 0, and the neuron stops learning.
To counter this, we need an activation function that outputs nonzero values for negative inputs so that the gradient persists. SELU is one such function.
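To make the difference concrete, here is a minimal sketch (the helper names `relu` and `selu` are introduced here for illustration) comparing the two functions on a negative input: ReLU outputs exactly 0, while SELU stays nonzero, so its gradient does not vanish.

```python
import numpy as np

# SELU constants (as used later in this lesson)
LAM = 1.0507
ALPHA = 1.6732

def relu(x):
    # ReLU: zero for all negative inputs -> zero gradient there
    return max(0.0, x)

def selu(x):
    # SELU: nonzero for negative inputs, so the gradient persists
    if x > 0:
        return LAM * x
    return LAM * ALPHA * (np.exp(x) - 1)

x = -2.0
print(relu(x))  # 0.0  -> neuron can "die"
print(selu(x))  # ≈ -1.52 -> nonzero output and gradient
```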
Implementation with Python
import numpy as np
import matplotlib.pyplot as plt

# initializing the constants
λ = 1.0507
α = 1.6732

def SELU(x):
    if x > 0:
        return λ*x
    return λ*α*(np.exp(x) - 1)

x = np.linspace(-5.0, 5.0)
result = []
for i in x:
    result.append(SELU(i))

plt.plot(x, result)
plt.title("SELU activation function")
plt.xlabel("Input")
plt.ylabel("Output")
plt.grid(True)
plt.savefig('output/selu_plot.png')
Explanation
- Lines 10–11: These lines implement the SELU equation mentioned above.
- Line 13: Here, we use np.linspace to generate evenly spaced numbers between −5.0 and 5.0. By default, it generates a total of 50 numbers.
- Lines 18–23: Here, we use the matplotlib library to plot the output of SELU over the given range.
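The loop above calls SELU once per element. As an alternative sketch (the name selu_vec is introduced here, not part of the lesson), NumPy's np.where can evaluate both branches elementwise and select between them, removing the Python loop entirely:

```python
import numpy as np

λ = 1.0507
α = 1.6732

def selu_vec(x):
    # np.where picks the positive or negative SELU branch per element
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, λ * x, λ * α * (np.exp(x) - 1))

x = np.linspace(-5.0, 5.0)
y = selu_vec(x)   # whole array at once, no Python loop
print(y.shape)    # (50,)
```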