In supervised learning, the AI model is trained on inputs together with their expected outputs, i.e., the labels of the inputs. From these input-output pairs, the model learns a mapping function, which it then uses to predict the labels of future, unseen inputs.
Let’s suppose we have to develop a model that differentiates between cats and dogs. To train the model, we feed it multiple images of cats and dogs, each with a label indicating whether the image is of a cat or a dog. The model tries to learn a mapping between the input images and their labels. After training, the model can predict whether an image shows a cat or a dog, even if it has never seen that image before.
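The idea above can be sketched with a minimal 1-nearest-neighbor classifier. The 2-D feature vectors and their values are hypothetical stand-ins for image features (e.g., ear length and snout length); real image classifiers learn far richer mappings, but the input-label structure is the same.

```python
import math

# Toy labeled training set: (features, label) pairs.
# The feature values are made up purely for illustration.
training_data = [
    ((1.0, 1.2), "cat"),
    ((1.1, 0.9), "cat"),
    ((3.0, 3.5), "dog"),
    ((3.2, 3.1), "dog"),
]

def predict(features):
    """Predict the label of an unseen input as the label of its
    closest training example (1-nearest-neighbor)."""
    _, label = min(training_data,
                   key=lambda pair: math.dist(pair[0], features))
    return label

# A previously unseen input close to the "cat" examples:
print(predict((1.05, 1.0)))  # prints "cat"
```

The "mapping equation" here is implicit: the label of a new point is determined by its distance to the labeled examples.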
In unsupervised learning, the AI model is trained only on the inputs, without their labels. The model groups the input data into clusters of similar features. The label of a future input is then predicted based on which cluster its features most resemble.
Suppose we have a collection of red and blue balls and we have to separate them into two classes. Let’s say all features of the balls are identical except their color. The model looks for the feature along which the balls differ and uses it to split them into two classes. After this, we get two clusters of balls, one blue and one red, without ever telling the model which ball is which.
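A minimal k-means sketch of this example, under the assumption that each ball is reduced to a single numeric feature encoding its color (hypothetically, red near 0.0 and blue near 1.0). Note that no labels appear anywhere in the data:

```python
# Unlabeled data: one color feature per ball (hypothetical encoding).
balls = [0.05, 0.1, 0.0, 0.95, 1.0, 0.9]

def kmeans_1d(points, k=2, iterations=10):
    """Minimal 1-D k-means: group points around k centroids
    by feature similarity, refining the centroids each pass."""
    centroids = points[:k]  # naive initialization
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return clusters

red_like, blue_like = kmeans_1d(balls)
print(red_like)   # balls with color values near 0.0
print(blue_like)  # balls with color values near 1.0
```

The algorithm recovers the two color groups purely from the dissimilarity of the feature values, which is the essence of clustering.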
In reinforcement learning, the AI model tries to take the best possible action in a given situation to maximize the cumulative reward. The model learns by getting feedback on the outcomes of its past actions.
Consider the example of a robot that is asked to choose between path A and path B. In the beginning, the robot chooses either path at random, as it has no past experience. The robot then receives feedback on the path it chose and learns from it. The next time the robot faces a similar situation, it can use this feedback to decide. For example, if the robot chose path B and received a reward, i.e., positive feedback, the robot now knows it should choose path B to maximize its reward.
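The robot's trial-and-error loop can be sketched as a simple value-learning agent. The reward function below is a hypothetical environment in which path B happens to be the better choice; the agent does not know this and must discover it from feedback:

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical environment: path B yields a reward, path A does not.
def take_path(path):
    return 1.0 if path == "B" else 0.0

q = {"A": 0.0, "B": 0.0}  # estimated value of each action
alpha = 0.5               # learning rate
epsilon = 0.2             # probability of exploring a random path

for episode in range(100):
    # Explore occasionally; otherwise exploit the best-known path.
    if random.random() < epsilon:
        path = random.choice(["A", "B"])
    else:
        path = max(q, key=q.get)
    reward = take_path(path)               # feedback from the environment
    q[path] += alpha * (reward - q[path])  # update estimate from feedback

print(max(q, key=q.get))  # the path the robot has learned to prefer
```

The epsilon parameter captures the exploration-exploitation trade-off mentioned in the table below: the robot must occasionally try the other path, or it might never discover the reward on path B.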
| Criteria | Supervised Learning | Unsupervised Learning | Reinforcement Learning |
|---|---|---|---|
| Input data | Input data is labelled. | Input data is not labelled. | Input data is not predefined. |
| Problem | Learn the pattern of inputs and their labels. | Divide data into classes. | Find the best reward between a start and an end state. |
| Solution | Finds a mapping equation between input data and its labels. | Finds similar features in input data to classify it into classes. | Maximizes reward by assessing the results of state-action pairs. |
| Model building | Model is built and trained prior to testing. | Model is built and trained prior to testing. | Model is trained and tested simultaneously. |
| Applications | Deals with regression and classification problems. | Deals with clustering and association rule mining problems. | Deals with exploration and exploitation problems. |
| Algorithms used | Decision trees, linear regression, K-nearest neighbors | K-means clustering, K-medoids clustering, agglomerative clustering | Q-learning, SARSA, Deep Q-Networks |
| Examples | Image detection, population growth prediction, etc. | Customer segmentation, feature elicitation, targeted marketing, etc. | Driverless cars, self-navigating vacuum cleaners, etc. |