Grokking Modern System Design Interview for Engineers & Managers
Ace your System Design Interview and take your career to the next level. Learn to handle the design of applications like Netflix, Quora, Facebook, Uber, and many more in a 45-min interview. Learn the RESHADED framework for architecting web-scale applications by determining requirements, constraints, and assumptions before diving into a step-by-step design process.
Within the domain of machine learning, a perceptron is an algorithm used for the supervised learning of binary classifiers.
Supervised learning refers to training a model on a labeled dataset, that is, a dataset in which every input is paired with its expected output.
Binary classifiers assign each input to one of two classes. A common example is a model that classifies images as showing either a cat or a dog.
Put simply, a perceptron acts as a binary classifier that distinguishes between two linearly separable classes, usually represented by 1 and 0.
A perceptron takes inputs, modifies the inputs using certain weights, and then uses a function to output the final result. This is shown in the diagram below.
The output generated is based on the input values, $x_1, x_2,...,x_m$. The output can only have two values (binary classification), usually 1 or 0.
The summation function (represented by Σ in the above diagram) multiplies each input by its weight and then adds up the results, together with the bias term $w_0$. This can be represented using the following equation:
$w_0 + w_1x_1 + w_2x_2 +...+w_mx_m$
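As a rough illustration, the weighted sum above can be written in a few lines of Python. The function name `net_input` and the sample numbers are assumptions made for this sketch, with `bias` playing the role of $w_0$:

```python
def net_input(weights, inputs, bias):
    """Return bias + w_1*x_1 + ... + w_m*x_m."""
    return bias + sum(w * x for w, x in zip(weights, inputs))


# Hypothetical values: two inputs, two weights, and a bias of -1.
print(net_input(weights=[1, 1], inputs=[0, 1], bias=-1))  # prints 0
```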
The activation function converts the numerical output of the summation function to 1 or 0.
The following is an example of a simple activation function:
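One simple choice, and the one used in the worked example later in this article, is a unit step function that outputs 1 for a positive net input and 0 otherwise. A minimal Python sketch (the name `step` is an assumption for illustration):

```python
def step(net):
    """Unit step activation: 1 if the net input is positive, else 0."""
    return 1 if net > 0 else 0


# A net input of exactly 0 is not positive, so it maps to 0.
for net in (-1, 0, 3):
    print(net, "->", step(net))
```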
Before we can use a perceptron to predict output, it is trained using labeled data. For each input in the training set, we compute the output. If the observed output does not match the expected output, we calculate the error.
Initially, we usually set the weights to randomly selected numbers.
We can then use the error to tweak the weights in favor of the expected output. We repeat this process until the perceptron gives a high degree of accuracy in its output.
How much we adjust our weights is controlled by the learning rate of the perceptron.
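The training procedure described above might look like the following Python sketch. It assumes a step activation that fires when the weighted sum is positive, an update proportional to the error (expected minus observed output), and a small, linearly separable toy dataset. The names `predict` and `train`, the default learning rate, the epoch limit, and the seed are all illustrative choices rather than a fixed recipe.

```python
import random


def predict(weights, bias, inputs):
    """Weighted sum followed by a step activation."""
    net = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if net > 0 else 0


def train(samples, learning_rate=0.1, max_epochs=1000, seed=0):
    """Adjust weights and bias until every sample is classified correctly."""
    rng = random.Random(seed)  # seeded so repeated runs behave the same
    n_inputs = len(samples[0][0])
    weights = [rng.uniform(-1, 1) for _ in range(n_inputs)]  # random start
    bias = rng.uniform(-1, 1)
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, expected in samples:
            error = expected - predict(weights, bias, inputs)
            if error != 0:
                mistakes += 1
                # The learning rate controls the size of each adjustment.
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * error
        if mistakes == 0:  # a full pass with no errors: stop training
            break
    return weights, bias


# Toy labeled dataset: ((x1, x2), expected output).
DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train(DATA)
print([predict(weights, bias, inputs) for inputs, _ in DATA])
```

A smaller learning rate makes each correction gentler at the cost of more passes over the data; a larger one takes bigger steps and can overshoot before settling.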
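The constant 1 in the input layer of the perceptron shown above is the bias input, and its weight, $w_0$, is referred to as the bias. The bias shifts the decision boundary away from the origin, so the perceptron can still separate the two classes when the boundary does not pass through the point where all inputs are 0.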
We can now use the perceptron learning rule to verify the logical OR gate.
The following table represents the logical OR gate:
$x_1$ | $x_2$ | $Y$ |
0 | 0 | 0 |
0 | 1 | 1 |
1 | 0 | 1 |
1 | 1 | 1 |
The following is our activation function:
$Y = 1$ if $w_1x_1 + w_2x_2 + b > 0$, and $Y = 0$ if $w_1x_1 + w_2x_2 + b \le 0$
Initially, we set both weights ($w_1, w_2$) to 1 and the bias $b$ to -1.
Let’s take the first row ($x_1 = 0$, $x_2 = 0$). After applying the net input function (the weighted sum plus the bias) on the first row, we get the following:
$x_1(1) + x_2(1) - 1 = -1$
Applying the activation function for -1:
Y = 0
The observed output (0) matches the expected output (0), so there is no need to tweak the weights.
Let’s take the second row ($x_1 = 0$, $x_2 = 1$). After applying the net input function on the second row, we get the following:
$x_1(1) + x_2(1) - 1 = 0$
Applying the activation function for 0:
Y = 0
The observed output (0) does not match the expected output (1). We need to tweak the weights.
Consider setting $w_2$ to 2. Now, when we apply the net input function, we get the following:
$x_1(1) + x_2(2) - 1 = 1$
Applying the activation function for 1:
Y = 1
The outputs match.
Now, we take the third row ($x_1 = 1$, $x_2 = 0$). After applying the net input function on the third row, we get the following:
$x_1(1) + x_2(2)-1 =0$
Applying the activation function for 0:
Y = 0
The observed output (0) does not match the expected output (1). We need to tweak the weights.
Since this row mirrors the second row, with the roles of $x_1$ and $x_2$ swapped, we simply change $w_1$ to 2 as well.
Now, applying the net input function:
$x_1(2)+x_2(2)-1=1$
Applying the activation function for 1:
Y = 1
The outputs match.
Finally, take the fourth row ($x_1 = 1$, $x_2 = 1$). After applying the net input function on the fourth row, we get the following:
$x_1(2) + x_2(2) - 1 = 3$
Applying the activation function:
Y = 1
The observed output (1) matches the expected output (1), so there is no need to tweak the weights.
Thus, we conclude that our perceptron with both weights set to 2 and the bias set to -1 reproduces the logical OR gate for all four combinations of its two inputs.
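The conclusion can be checked mechanically. The following Python sketch evaluates a perceptron with weights of 2 and a bias of -1 on every row of the truth table, using the activation rule from the walkthrough; the names `OR_TABLE` and `perceptron` are assumptions made for this example.

```python
# Truth table for OR: ((x1, x2), expected output).
OR_TABLE = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
WEIGHTS = (2, 2)
BIAS = -1


def perceptron(inputs, weights=WEIGHTS, bias=BIAS):
    """Net input followed by the step activation (1 only if net > 0)."""
    net = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if net > 0 else 0


for inputs, expected in OR_TABLE:
    print(inputs, "observed:", perceptron(inputs), "expected:", expected)
```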
Note: Tweaking the weights in the above example was based on intuition. However, we may also use the following formula:
$w_n = w_o + \alpha (t - y) x_i$
$w_n$: new weight
$w_o$: old weight
$\alpha$: learning rate
$t$: expected output
$y$: observed output
$x_i$: input
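To see the formula in action, the sketch below replays the walkthrough with a learning rate of 1 and the bias held fixed at -1, as in the example. It uses the error-driven form of the update, scaling each step by the difference between the expected and observed output so that correctly classified rows leave the weights unchanged. This choice, along with the names `update_weights` and `replay`, is an assumption of the sketch.

```python
def predict(weights, bias, inputs):
    """Net input followed by the step activation (1 only if net > 0)."""
    net = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if net > 0 else 0


def update_weights(weights, inputs, expected, observed, learning_rate=1):
    """Apply w_new = w_old + learning_rate * (expected - observed) * x_i."""
    error = expected - observed
    return [w + learning_rate * error * x for w, x in zip(weights, inputs)]


def replay(rows, weights, bias):
    """Visit each row once, updating the weights after every prediction."""
    for inputs, expected in rows:
        observed = predict(weights, bias, inputs)
        weights = update_weights(weights, inputs, expected, observed)
        print(inputs, "->", weights)
    return weights


OR_TABLE = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
final = replay(OR_TABLE, weights=[1, 1], bias=-1)
print(final)  # prints [2, 2]
```

Starting from weights of 1, the second row raises $w_2$ to 2 and the third row raises $w_1$ to 2, matching the adjustments made by intuition above.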