For anyone taking their first steps in data science, probability is a must-know concept. In this lesson, we will learn this important piece of the puzzle by going through each concept in a simple way.

What Is Probability?

Probability is the numerical chance that something will happen; it tells us how likely it is that some event will occur.

Probability is one of those intuitive concepts that we use on a daily basis, without necessarily realizing that we are talking probability. Our lives are full of uncertainties; unless someone has superpowers to foresee the future, we don’t know the outcomes of a particular situation or event until it actually happens. Will I pass the exam with flying colors? Will it snow today? Will my favorite team win the match? These are some examples of uncertain events. In statistical terms, “team won” is the outcome while “my team winning today’s match” is the event. Probability is the measure of how likely this outcome is.

For example, if it is 80% likely that my team will win today, the probability of the outcome “the team won” for today’s match is 0.8; while the probability of the opposite outcome, “it lost”, is 0.2, i.e., 1 - 0.8. Probability is represented as a number between 0 and 1, where 0 indicates impossibility and 1 indicates certainty.

Why Is Probability Important?

With all the uncertainty and randomness that occurs in our daily life, probability helps us make sense of these uncertainties. It helps us understand the chances of various events. This, in turn, means that we can make informed decisions based on estimates or patterns of data collected previously. For example, if it is likely to rain, we can grab an umbrella before heading out. Or if a user is unlikely to check our app without a reminder, we can send them a notification.

How Does Probability Fit in Data Science?

Understanding the methods and models needed for data science, like logistic regression which we will encounter in the Machine Learning section, randomization in A/B testing, or experimental design, and sampling of data are examples of use-cases that require a good understanding of probability.

Calculating Probability of Events

Probability is a type of ratio where we compare how many times an outcome can occur compared to all possible outcomes. Simply put:

Get hands-on with 1200+ tech skills courses.