In supervised learning, the AI model is trained on inputs together with their expected outputs, i.e., the labels of the inputs. From these input-output pairs, the model learns a mapping function, which it then uses to predict the labels of future, unseen inputs.
Let’s suppose we have to develop a model that differentiates between cats and dogs. To train the model, we feed it multiple images of cats and dogs, each with a label indicating whether the image is of a cat or a dog. The model tries to learn a mapping between the input images and their labels. After training, the model can predict whether an image shows a cat or a dog, even if it has never seen that image before.
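The idea above can be sketched with a minimal 1-nearest-neighbor classifier. The 2-D feature vectors and their values are hypothetical stand-ins for image features (e.g., ear length and snout length); real image classifiers learn far richer mappings, but the input-label structure is the same.

```python
import math

# Toy labeled training set: (features, label) pairs.
# The feature values are made up purely for illustration.
training_data = [
    ((1.0, 1.2), "cat"),
    ((1.1, 0.9), "cat"),
    ((3.0, 3.5), "dog"),
    ((3.2, 3.1), "dog"),
]

def predict(features):
    """Predict the label of an unseen input as the label of its
    closest training example (1-nearest-neighbor)."""
    _, label = min(training_data,
                   key=lambda pair: math.dist(pair[0], features))
    return label

# A previously unseen input close to the "cat" examples:
print(predict((1.05, 1.0)))  # prints "cat"
```

The "mapping equation" here is implicit: the label of a new point is determined by its distance to the labeled examples.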
In unsupervised learning, the AI model is trained only on the inputs, without their labels. The model groups the input data into clusters of similar features. The label of a future input is then predicted based on which cluster its features most resemble.
Suppose we have a collection of red and blue balls and we have to separate them into two classes. Let’s say all features of the balls are identical except their color. The model looks for the feature along which the balls differ and uses it to split them into two classes. After this, we get two clusters of balls, one blue and one red, without ever telling the model which ball is which.
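A minimal k-means sketch of this example, under the assumption that each ball is reduced to a single numeric feature encoding its color (hypothetically, red near 0.0 and blue near 1.0). Note that no labels appear anywhere in the data:

```python
# Unlabeled data: one color feature per ball (hypothetical encoding).
balls = [0.05, 0.1, 0.0, 0.95, 1.0, 0.9]

def kmeans_1d(points, k=2, iterations=10):
    """Minimal 1-D k-means: group points around k centroids
    by feature similarity, refining the centroids each pass."""
    centroids = points[:k]  # naive initialization
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return clusters

red_like, blue_like = kmeans_1d(balls)
print(red_like)   # balls with color values near 0.0
print(blue_like)  # balls with color values near 1.0
```

The algorithm recovers the two color groups purely from the dissimilarity of the feature values, which is the essence of clustering.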
In reinforcement learning, the AI model tries to take the best possible action in a given situation to maximize the cumulative reward. The model learns by getting feedback on the outcomes of its past actions.
Consider the example of a robot that is asked to choose between path A and path B. In the beginning, the robot chooses either path at random, as it has no past experience. The robot then receives feedback on the path it chose and learns from it. The next time the robot faces a similar situation, it can use this feedback to decide. For example, if the robot chose path B and received a reward, i.e., positive feedback, the robot now knows it should choose path B to maximize its reward.
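The robot's trial-and-error loop can be sketched as a simple value-learning agent. The reward function below is a hypothetical environment in which path B happens to be the better choice; the agent does not know this and must discover it from feedback:

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical environment: path B yields a reward, path A does not.
def take_path(path):
    return 1.0 if path == "B" else 0.0

q = {"A": 0.0, "B": 0.0}  # estimated value of each action
alpha = 0.5               # learning rate
epsilon = 0.2             # probability of exploring a random path

for episode in range(100):
    # Explore occasionally; otherwise exploit the best-known path.
    if random.random() < epsilon:
        path = random.choice(["A", "B"])
    else:
        path = max(q, key=q.get)
    reward = take_path(path)               # feedback from the environment
    q[path] += alpha * (reward - q[path])  # update estimate from feedback

print(max(q, key=q.get))  # the path the robot has learned to prefer
```

The epsilon parameter captures the exploration-exploitation trade-off mentioned in the table below: the robot must occasionally try the other path, or it might never discover the reward on path B.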
| Criteria | Supervised Learning | Unsupervised Learning | Reinforcement Learning |
|---|---|---|---|
| Input data | Input data is labelled. | Input data is not labelled. | Input data is not predefined. |
| Problem | Learn the pattern of inputs and their labels. | Divide data into classes. | Find the best reward between a start and an end state. |
| Solution | Finds a mapping equation between input data and its labels. | Finds similar features in input data to classify it into classes. | Maximizes reward by assessing the results of state-action pairs. |
| Model building | Model is built and trained prior to testing. | Model is built and trained prior to testing. | Model is trained and tested simultaneously. |
| Applications | Deals with regression and classification problems. | Deals with clustering and association rule mining problems. | Deals with exploration and exploitation problems. |
| Algorithms used | Decision trees, linear regression, K-nearest neighbors | K-means clustering, K-medoids clustering, agglomerative clustering | Q-learning, SARSA, Deep Q-Networks |
| Examples | Image detection, population growth prediction, etc. | Customer segmentation, feature elicitation, targeted marketing, etc. | Driverless cars, self-navigating vacuum cleaners, etc. |