
# What is the perceptron learning rule?

Khizar Hayat Saani


Within the domain of machine learning, a perceptron is an algorithm used for the supervised learning of binary classifiers.

## Supervised learning

Supervised learning refers to the training of a model using a labeled dataset. In a labeled dataset, each input is paired with the output the model is expected to produce for it.

## Binary classifiers

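Binary classifiers sort inputs into two classes. A common example is a model that labels each image as either a cat or a dog.

Put simply, a binary classifier assigns each input to one of two classes, usually represented by 1 and 0. A perceptron can do this only when the two classes are linearly separable, that is, when a straight line (or, with more inputs, a hyperplane) can divide them.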

Figure: Linearly separable classes

## Perceptron learning rule

A perceptron takes inputs, multiplies each input by a weight, sums the results, and then uses an activation function to produce the final output. This is shown in the diagram below.

Figure: Perceptron learning rule

The output generated is based on the input values, $x_1, x_2,...,x_m$. The output can only have two values (binary classification), usually 1 or 0.

The summation function (represented by Σ in the above diagram) multiplies each input by its weight and adds up the results, together with the bias term $w_0$. This can be represented using the following equation:

$w_0 + w_1x_1 + w_2x_2 +...+w_mx_m$
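The weighted sum above can be sketched in a few lines of Python. The function name `net_input` and the sample numbers are illustrative assumptions, not part of the original answer; `bias` plays the role of $w_0$.

```python
def net_input(inputs, weights, bias):
    """Return w_0 + w_1*x_1 + ... + w_m*x_m, with `bias` standing in for w_0."""
    return bias + sum(w * x for w, x in zip(weights, inputs))


# Hypothetical values: x = (2, 3), w = (1, 2), w_0 = -1
print(net_input([2, 3], [1, 2], -1))  # -1 + 1*2 + 2*3 = 7
```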

## Activation function

The activation function converts the numerical output of the summation function to 1 or 0.

The following is an example of a simple activation function:

Figure: A simple activation function
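One simple choice is a step (threshold) function. The sketch below assumes the same threshold used in the OR gate example later in this answer: output 1 when the weighted sum is greater than 0, and 0 otherwise. The name `step` is an illustrative assumption.

```python
def step(z):
    """Threshold activation: 1 for a positive weighted sum, otherwise 0."""
    return 1 if z > 0 else 0


for z in (-1, 0, 1):
    print(z, "->", step(z))
```

Note that a weighted sum of exactly 0 maps to 0 under this convention.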

## Error and adjustments

Before we can use a perceptron to predict output, it is trained using labeled data. For each input in the training set, we compute the output. If the observed output does not match the expected output, we calculate the error.

Initially, we usually set the weights to randomly selected numbers.

We can then use the error to tweak the weights in favor of the expected output. We repeat this process until the perceptron gives a high degree of accuracy in its output.

How much we adjust our weights is controlled by the learning rate of the perceptron.
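To make the role of the learning rate concrete, here is a minimal sketch of a single adjustment. It assumes the common form of the rule, in which each weight moves by the learning rate times the error (expected minus observed output) times the corresponding input. The function name `update` and the sample values are assumptions for illustration.

```python
def update(weights, inputs, target, output, learning_rate):
    """Return new weights after one correction step."""
    error = target - output  # 0 when the prediction was already correct
    return [w + learning_rate * error * x for w, x in zip(weights, inputs)]


# Same mistake, two learning rates: the larger rate makes the larger change.
print(update([1, 1], [0, 1], target=1, output=0, learning_rate=1))    # [1, 2]
print(update([1, 1], [0, 1], target=1, output=0, learning_rate=0.5))  # [1.0, 1.5]
```

Only the weight whose input was nonzero changes, because an input of 0 contributed nothing to the wrong output.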

## Bias

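The constant input 1 in the perceptron shown above feeds the bias term: it is multiplied by the weight $w_0$, so $w_0$ is added to the weighted sum regardless of the input values. The bias shifts the decision boundary away from the origin, which lets the perceptron fit data that a boundary passing through the origin could not separate.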
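A short sketch of why the bias matters, assuming a threshold activation that outputs 1 only for a positive weighted sum. The function name `predict` and the sample weights are illustrative assumptions. Without a bias, an all-zero input always produces a weighted sum of 0, so no choice of weights can make the output 1 for that input.

```python
def predict(inputs, weights, bias=0):
    """Weighted sum plus bias, passed through a threshold at 0."""
    z = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if z > 0 else 0


# No bias: the all-zero input gives 0 whatever the weights are.
print([predict([0, 0], [w, w]) for w in (-5, 1, 100)])  # [0, 0, 0]

# A positive bias shifts the threshold, so the same input can now give 1.
print(predict([0, 0], [1, 1], bias=0.5))  # 1
```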

## Example

We can now use the perceptron learning rule to train a perceptron that reproduces the logical OR gate.

The following table represents the logical OR gate:

| x1 | x2 | Y |
|----|----|---|
| 0  | 0  | 0 |
| 0  | 1  | 1 |
| 1  | 0  | 1 |
| 1  | 1  | 1 |

The following is our activation function:

Y = 1 if wx+b > 0
and
Y = 0 if wx+b ≤ 0

Initially, we set both weights ($w_1, w_2$) to 1 and the bias to -1.
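Before going row by row, it can help to see how these starting values behave on the whole table. The following is a minimal sketch; the names `OR_ROWS` and `predict` are assumptions made for illustration.

```python
# Each entry is ((x1, x2), expected Y) from the OR table.
OR_ROWS = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]


def predict(inputs, weights, bias):
    """Y = 1 if w.x + b > 0, otherwise Y = 0."""
    z = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if z > 0 else 0


for inputs, expected in OR_ROWS:
    observed = predict(inputs, weights=(1, 1), bias=-1)
    print(inputs, "observed:", observed, "expected:", expected)
```

With these starting values, the second and third rows come out as 0 instead of 1, which is what the walkthrough below corrects.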

### First row

Let’s take the first row. After applying the net input function on the first row, we get the following:

$w_1x_1 + w_2x_2 + b = (1)(0) + (1)(0) - 1 = -1$

Applying the activation function for -1:

Y = 0

The observed output (0) matches the expected output (0), so there is no need to tweak the weights.

### Second row

Let’s take the second row. After applying the net input function on the second row, we get the following:

$w_1x_1 + w_2x_2 + b = (1)(0) + (1)(1) - 1 = 0$

Applying the activation function for 0:

Y = 0

The observed output (0) does not match the expected output (1). We need to tweak the weights.

Consider setting $w_2$ to 2. Now, when we apply the net input function, we get the following:

$w_1x_1 + w_2x_2 + b = (1)(0) + (2)(1) - 1 = 1$

Applying the activation function for 1:

Y = 1

The outputs match.

### Third row

Now, we take the third row. After applying the net input function on the third row, we get the following:

$w_1x_1 + w_2x_2 + b = (1)(1) + (2)(0) - 1 = 0$

Applying the activation function for 0:

Y = 0

The observed output (0) does not match the expected output (1). We need to tweak the weights.

Since this row mirrors the second row, with the roles of $x_1$ and $x_2$ swapped, we simply change $w_1$ to 2 as well.

Now, applying the net input function:

$w_1x_1 + w_2x_2 + b = (2)(1) + (2)(0) - 1 = 1$

Applying the activation function for 1:

Y = 1

The outputs match.

### Fourth row

Finally, take the fourth row. After applying the net input function on the fourth row, we get the following:

$w_1x_1 + w_2x_2 + b = (2)(1) + (2)(1) - 1 = 3$

Applying the activation function:

Y = 1

The observed output (1) matches the expected output (1), so there is no need to tweak the weights.

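Rechecking the earlier rows with the final weights gives net inputs of -1, 1, and 1, so their outputs (0, 1, and 1) still match. Thus, we conclude that our perceptron with both weights set to 2 and the bias set to -1 reproduces the logical OR gate for all four input combinations.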
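The conclusion can be checked mechanically. This is a small sketch that hard-codes the final values from the walkthrough (both weights 2, bias -1); the names `OR_TABLE` and `perceptron` are illustrative assumptions.

```python
# Maps (x1, x2) to the expected OR output.
OR_TABLE = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}


def perceptron(x1, x2, w1=2, w2=2, bias=-1):
    """Y = 1 if w1*x1 + w2*x2 + bias > 0, otherwise Y = 0."""
    return 1 if w1 * x1 + w2 * x2 + bias > 0 else 0


for inputs, expected in OR_TABLE.items():
    assert perceptron(*inputs) == expected

print("all four rows match")
```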

Note: Tweaking the weights in the above example was based on intuition. However, we may also use the following formula:

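$w_n = w_o + \alpha (t - y) x_i$

$w_n$: new weight
$w_o$: old weight
$\alpha$: learning rate
$t$: expected output
$y$: observed output
$x_i$: input

When the observed output matches the expected output, $t - y = 0$ and the weight is left unchanged.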
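As a rough sketch, the intuition-based tweaks in the example can be reproduced by applying an update of this kind in a loop. The code assumes a learning rate of 1, keeps the bias fixed at -1 as the walkthrough does, and uses the error (expected minus observed output) in the update. For the rows corrected here (expected 1, observed 0), the error equals the expected output. The names `step`, `train`, and `OR_ROWS` are assumptions for illustration.

```python
def step(z):
    """Threshold activation: 1 for a positive net input, otherwise 0."""
    return 1 if z > 0 else 0


def train(rows, weights, bias, learning_rate=1, max_epochs=10):
    """Adjust the weights row by row until one full pass makes no mistakes."""
    weights = list(weights)
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in rows:
            output = step(bias + sum(w * x for w, x in zip(weights, inputs)))
            error = target - output
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
        if mistakes == 0:
            break
    return weights


OR_ROWS = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
print(train(OR_ROWS, weights=[1, 1], bias=-1))  # [2, 2]
```

The first pass changes $w_2$ on the second row and $w_1$ on the third, and the second pass finds no mistakes. In a fuller implementation the bias would usually be updated as well. It is held constant here only to mirror the worked example.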
