Activation functions are essential parts of neural networks. In this shot, we will review how neural networks operate in order to understand activation functions, and then go through some common types of activation functions.
Neural networks, also called artificial neural networks (ANNs), are a subset of machine learning and are central to deep learning algorithms.
As their name suggests, they are loosely modeled on how the human brain learns. The brain receives stimuli from the external environment, processes the information, and then produces an output. As the task becomes more difficult, numerous neurons form a complex network in which they communicate with one another.
The image above shows a neural network with interconnected neurons. Each neuron is defined by its weights, bias, and activation function. The neuron first computes a weighted sum of its inputs and adds the bias:
$x = \sum(\text{weight} \times \text{input}) + \text{bias}$
Activation functions are essential components of neural networks because they introduce non-linearity. Without activation functions, a neural network would behave like a linear regression model, no matter how many layers it has. Activation functions also determine whether a neuron should fire. A non-linear transformation is applied to the weighted sum before it is sent to the next layer of neurons; in the final layer, the transformed value is the output of the network.
$y = \text{Activation}\left(\sum(\text{weight} \times \text{input}) + \text{bias}\right)$
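As a minimal sketch of the computation above, the following Python snippet evaluates a single neuron: a weighted sum of the inputs plus the bias, passed through an activation function. The function name `neuron_output` and the input, weight, and bias values are hypothetical and chosen only for illustration. Sigmoid is used here simply as one example of an activation.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Return activation(sum(weight * input) + bias) for one neuron."""
    # Pre-activation value: weighted sum of the inputs plus the bias.
    x = sum(w * i for w, i in zip(weights, inputs)) + bias
    # The activation function turns the weighted sum into the neuron's output.
    return activation(x)


# Hypothetical values, chosen so that the weighted sum is about 0.
inputs = [0.5, -1.0, 2.0]
weights = [0.4, 0.3, -0.2]
bias = 0.5

print(neuron_output(inputs, weights, bias))  # approximately 0.5, since sigmoid(0) = 0.5
```

Passing a different function as `activation` changes only the last step. The weighted sum stays the same.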
Simply put, activation functions are like sensors that will trigger your brain neurons to recognize when you smell something pleasant or unpleasant.
The non-linear nature of most activation functions is intentional. With non-linear activation functions, a neural network with enough neurons can approximate arbitrarily complex functions.
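To see why non-linearity matters, here is a small sketch using one-dimensional "layers". All names and weights below are hypothetical. It shows that two stacked linear layers always collapse into a single linear layer, while placing a non-linear function (ReLU, as an example) between them produces something a single linear layer cannot reproduce.

```python
def linear_layer(x, weight, bias):
    """A one-dimensional linear layer: weight * x + bias."""
    return weight * x + bias


def relu(x):
    """Example non-linear activation: keep positive values, zero out the rest."""
    return max(0.0, x)


# Hypothetical weights and biases for two layers.
w1, b1 = 2.0, 1.0
w2, b2 = -3.0, 0.5


def two_linear_layers(x):
    """Two linear layers with no activation in between."""
    return linear_layer(linear_layer(x, w1, b1), w2, b2)


def collapsed_layer(x):
    """One linear layer equivalent to the two above:
    w2 * (w1 * x + b1) + b2 = (w2 * w1) * x + (w2 * b1 + b2)."""
    return linear_layer(x, w2 * w1, w2 * b1 + b2)


def two_layers_with_relu(x):
    """The same two layers with a ReLU between them."""
    return linear_layer(relu(linear_layer(x, w1, b1)), w2, b2)


for x in [-2.0, -1.0, 0.0, 1.0]:
    print(x, two_linear_layers(x), collapsed_layer(x), two_layers_with_relu(x))
```

The second and third columns match for every input, so the extra linear layer adds nothing. The last column differs for the negative inputs, which is where ReLU changes the signal.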
Activation functions can be categorized into linear and non-linear activation functions.
A linear activation function returns an output proportional to its input, so the output is not confined to any range. It does not add to the complexity of the functions that the network can represent.
Non-linear activation functions are the most popular because they allow the model to generalise or adapt to a wide variety of data while still distinguishing between outputs.
Below is a brief overview of three common activation functions: sigmoid, ReLU, and tanh.
Here’s the mathematical expression for the sigmoid function: $f(x) = \frac{1}{1 + e^{-x}}$
Here’s the mathematical expression for ReLU: $f(x) = \max(0, x)$, that is, $0$ if $x \le 0$, otherwise $x$
Here’s the mathematical expression for tanh: $f(x) = \frac{2}{1 + e^{-2x}} - 1$ or $f(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$
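The three functions can be sketched in plain Python as follows. This assumes the standard definitions: sigmoid as $\frac{1}{1 + e^{-x}}$, ReLU as $\max(0, x)$, and tanh in both of its equivalent forms. The function names are illustrative. In practice, a deep learning library would normally supply these functions.

```python
import math


def sigmoid(x):
    """Sigmoid: 1 / (1 + e^(-x)); output lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    """ReLU: max(0, x); 0 for non-positive inputs, x otherwise."""
    return max(0.0, x)


def tanh(x):
    """Tanh: (e^x - e^(-x)) / (e^x + e^(-x)); output lies in (-1, 1)."""
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))


def tanh_via_sigmoid(x):
    """Equivalent form of tanh: 2 / (1 + e^(-2x)) - 1."""
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0


for x in [-2.0, 0.0, 2.0]:
    print(f"x={x:+.1f}  sigmoid={sigmoid(x):.4f}  relu={relu(x):.1f}  "
          f"tanh={tanh(x):.4f}  tanh_via_sigmoid={tanh_via_sigmoid(x):.4f}")
```

The last two columns agree, which confirms that the two expressions for tanh describe the same function. These naive versions can overflow for inputs of very large magnitude, so a built-in such as `math.tanh` is the safer choice in real code.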