Trusted answers to developer questions

# What is learning rate and what role does it play?

Samia Ishaque


## Learning rate

The learning rate is a small number, typically between 0.0 and 1.0, that controls how a neural network is trained. The model needs to learn to minimize loss, which is the error between its predicted value for an input and the real label for that input. To do so, the model must update its weights (the parameters in a neural network that help transform input data to its labels) after every epoch (one full cycle of training the network on the training data) in response to the loss. The extent to which the weights are changed is determined by the learning rate.

The formula for updating weights is as follows:

$newWeight = oldWeight - learningRate \times gradientOfLossFunction$

As shown, the weights are updated with every epoch to move toward their optimum values, and the learning rate determines how fast the model learns and adapts. If you choose a value that is too big, the model will train faster, but it may overshoot and settle on weights that are not optimal. On the other hand, a value that is too small results in more time taken to train the model. Hence, it is crucial to use a learning rate that is just right for your model.
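The update rule above can be sketched in a few lines of Python. This is a minimal illustration for a single linear neuron with mean-squared-error loss; the data, starting weight, and the 0.1 learning rate are illustrative assumptions, not values from this article.

```python
import numpy as np

# Toy data: inputs and their real labels (here, y = 2x, so the
# optimal weight is 2.0).
X = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

weight = 0.0          # starting weight (arbitrary choice)
learning_rate = 0.1   # assumed learning rate for illustration

for epoch in range(50):
    predictions = weight * X
    # Gradient of the mean-squared-error loss with respect to the weight
    gradient = np.mean(2 * (predictions - y) * X)
    # newWeight = oldWeight - learningRate * gradientOfLossFunction
    weight = weight - learning_rate * gradient

print(round(weight, 2))  # converges toward the optimal weight, 2.0
```

Each pass through the loop plays the role of one epoch: the loss gradient is computed, scaled by the learning rate, and subtracted from the current weight.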

Moreover, the learning rate is known as a hyperparameter because we set it ourselves for each model. Generally, the $\alpha$ symbol is used to represent the learning rate.

## Tuning the learning rate

The optimal learning rate is determined through trial and error, because it is not possible to predict analytically which learning rate will suit a model beforehand. It is advised to start with values such as 0.1 or 0.01. After trying different learning rates, you can analyze how fast or slow the model learns over the epochs.
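This trial-and-error process can be sketched by training the same toy model with several candidate learning rates and comparing the loss after a fixed number of epochs. The model, data, and candidate rates below are assumptions chosen for illustration.

```python
import numpy as np

# Same toy setup as before: a single linear neuron fitting y = 2x.
X = np.array([1.0, 2.0, 3.0])
y = 2.0 * X

def final_loss(learning_rate, epochs=20):
    """Train for a fixed number of epochs and return the final MSE loss."""
    weight = 0.0
    for _ in range(epochs):
        gradient = np.mean(2 * (weight * X - y) * X)
        weight -= learning_rate * gradient
    return np.mean((weight * X - y) ** 2)

# Compare candidate learning rates; smaller rates learn more slowly,
# so their loss after the same number of epochs is higher.
for lr in (0.1, 0.01, 0.001):
    print(f"lr={lr}: loss after 20 epochs = {final_loss(lr):.6f}")
```

Plotting loss per epoch for each candidate rate, rather than just the final value, gives an even clearer picture of how fast each setting learns.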
