
What is the difference between regression and classification?


Regression and classification are two kinds of problems that machine learning algorithms are designed to solve. Both regression algorithms and classification algorithms fall under supervised machine learning.

Supervised machine learning occurs when a model is trained on existing data that is correctly labeled.

The key difference between classification and regression is that classification predicts a discrete label, while regression predicts a continuous quantity or value.


Let’s consider regression and classification individually:

Regression

Regression is the process of finding a model that predicts a continuous value based on its input variables. In regression problems, the goal is to mathematically estimate a mapping function (f) from the input variables (x) to the output variables (y).

Consider a dataset that contains information about all the students in a university. An example of a regression task would be to predict the height of any student based on their gender, weight, major, and diet. This is a regression task because height is a continuous quantity; i.e., a person's height can take infinitely many possible values.
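
To make the height example concrete, here is a minimal sketch of a regression model. It assumes a single input variable (weight) and fits a straight line by ordinary least squares; the data values and the names fit_line and predict_height are hypothetical and chosen only for illustration, not taken from any real dataset or library.

```python
# Hypothetical toy data (an assumption): weight in kg and height in cm
# for five students.
weights = [50.0, 60.0, 70.0, 80.0, 90.0]
heights = [155.0, 163.0, 171.0, 179.0, 187.0]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(weights, heights)


def predict_height(weight):
    """The estimated mapping function f: weight (x) -> height (y)."""
    return slope * weight + intercept


# The output is a continuous value, not a label.
predicted = predict_height(75.0)
print(f"Predicted height: {predicted:.1f} cm")
```

The prediction can be any real number, which is what makes this a regression problem. A real model would likely use several input variables and far more data.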

A regression algorithm is commonly evaluated by calculating the root mean squared error (RMSE) of its predictions.
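
As a rough illustration, the root mean squared error is the square root of the average squared difference between the actual and predicted values. The sketch below computes it with the standard library only; the function name rmse and the sample heights are assumptions made up for this example.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    n = len(actual)
    squared_errors = ((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(sum(squared_errors) / n)


# Hypothetical actual and predicted heights in cm (assumed values).
actual = [160.0, 170.0, 180.0, 175.0]
predicted = [162.0, 168.0, 177.0, 179.0]

error = rmse(actual, predicted)
print(f"RMSE: {error:.2f} cm")
```

A lower value means the predictions are, on average, closer to the true values. The error is expressed in the same unit as the quantity being predicted.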

Classification

On the other hand, classification is the process of finding a model that separates input data into multiple discrete classes or labels. In other words, a classification model decides which of a set of pre-identified groups an input belongs to.

Consider the same dataset of all the students at a university. A classification task would be to use parameters, such as a student's weight, major, and diet, to determine whether they fall into the "Above Average" or "Below Average" category (for example, for height). Note that there are only two discrete labels into which the data is classified.
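
The two-label example can be sketched with a very simple classifier. The version below uses a 1-nearest-neighbor rule on a single input (weight), which is just one of many possible classification techniques and is an assumption of this sketch; the training data and the name classify are hypothetical.

```python
# Hypothetical labeled training data (an assumption): (weight in kg, label).
training = [
    (52.0, "Below Average"),
    (58.0, "Below Average"),
    (61.0, "Below Average"),
    (74.0, "Above Average"),
    (80.0, "Above Average"),
    (88.0, "Above Average"),
]


def classify(weight):
    """Return the label of the training example closest in weight."""
    nearest = min(training, key=lambda example: abs(example[0] - weight))
    return nearest[1]


# The output is always one of the two discrete labels.
print(classify(56.0))
print(classify(79.0))
```

Unlike the regression case, the model can only ever return one of a fixed set of labels, no matter what input it receives.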

A classification algorithm is commonly evaluated by computing its accuracy, i.e., the fraction of inputs that it classifies correctly.
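
A minimal sketch of that calculation is shown below: accuracy is the number of correct predictions divided by the total number of predictions. The function name accuracy and the label lists are assumptions invented for this example.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical true and predicted labels for five students (assumed values).
true_labels = [
    "Above Average", "Below Average", "Below Average",
    "Above Average", "Below Average",
]
predicted_labels = [
    "Above Average", "Below Average", "Above Average",
    "Above Average", "Below Average",
]

# Four of the five predictions match, so the accuracy is 0.8.
score = accuracy(true_labels, predicted_labels)
print(f"Accuracy: {score:.2f}")
```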

Copyright ©2024 Educative, Inc. All rights reserved