Now we’ll discuss the terminology of probability theory. Probability theory is a vital part of machine learning because modeling data with probabilistic models lets us quantify how uncertain a model is about its predictions. Consider sentiment analysis as a use case: we want to output a prediction (positive or negative) for a given movie review. A model outputs some value between 0 and 1 (0 for negative and 1 for positive) for any sample we input, but that value alone doesn’t tell us how uncertain the model is about its answer.

Let’s see how uncertainty helps us make better predictions. For example, a deterministic model (i.e., a model that outputs an exact value instead of a distribution over the value) might incorrectly say the positivity of the review “I never lost interest” is 0.25 (that is, it’s more likely to be a negative comment). A probabilistic model, in contrast, gives a mean value and a standard deviation for the prediction. For example, it might say this prediction has a mean of 0.25 and a standard deviation of 0.5. With the second model, the high standard deviation tells us that the prediction is unreliable and may well be wrong. The deterministic model gives us no such signal. This property is especially valuable for critical machine learning systems.
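To see why the standard deviation matters, here is a minimal sketch that turns a mean and a standard deviation into the probability that the review is actually positive. It assumes the predictive distribution is Gaussian and that 0.5 is the decision threshold; neither assumption comes from the text above, and the function names are hypothetical.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def prob_positive(mean, std, threshold=0.5):
    """Probability that the score exceeds the threshold.

    Assumption: the model's predictive distribution is Gaussian.
    """
    return 1.0 - normal_cdf(threshold, mean, std)


# Same mean as in the example above, with a high and a low standard deviation.
uncertain = prob_positive(mean=0.25, std=0.5)
confident = prob_positive(mean=0.25, std=0.05)

print(f"P(positive) with std 0.5:  {uncertain:.4f}")
print(f"P(positive) with std 0.05: {confident:.2e}")
```

Under these assumptions, the high-variance prediction still leaves roughly a 31% chance that the review is positive, while the low-variance prediction leaves almost none. A point estimate of 0.25 hides this difference.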

To develop such probabilistic machine learning models (for example, Bayesian logistic regression, Bayesian neural networks, or Gaussian processes), we should be familiar with basic probability theory. Therefore, we’ll provide some basic probability information here.

Random variables

A random variable is a variable that takes its value at random. Random variables are represented as $x_1$, $x_2$, and so on. Random variables can be of two types: discrete and continuous.

Discrete random variables

A discrete random variable is a random variable that can take only discrete values. For example, flipping a coin can be modeled as a random variable: the side the coin lands on is a discrete variable because its value can only be heads or tails. Similarly, the value we get when we roll a die is discrete because it can only come from the set $\{1,2,3,4,5,6\}$.
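The coin and die examples can be simulated with Python’s standard library. This is an illustrative sketch that assumes a fair coin and a fair die (the text doesn’t specify the probabilities); the variable names are hypothetical.

```python
import random
from collections import Counter

rng = random.Random(0)  # fixed seed so the run is repeatable

# Each draw is one realization of a discrete random variable.
coin_flips = [rng.choice(["heads", "tails"]) for _ in range(1000)]
die_rolls = [rng.randint(1, 6) for _ in range(1000)]  # endpoints inclusive

# Empirical frequency of each possible value.
coin_freq = {side: n / len(coin_flips) for side, n in Counter(coin_flips).items()}
die_freq = {face: n / len(die_rolls) for face, n in sorted(Counter(die_rolls).items())}

print(coin_freq)
print(die_freq)
```

Every sampled value falls in a small, countable set of outcomes, which is what makes these variables discrete.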

Continuous random variables

A continuous random variable is a variable that can take any real value; that is, if $x$ is a continuous random variable, then $x \in \mathbb{R}$.
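For contrast with the discrete case, the following sketch draws samples of a continuous random variable. The standard normal distribution is used here purely as an assumption for illustration; any distribution over the real numbers would serve.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Draw realizations of a continuous random variable.
# Assumption: a normal distribution with mean 0 and standard deviation 1.
samples = [rng.gauss(0.0, 1.0) for _ in range(1000)]

print(samples[:5])                 # arbitrary real-valued numbers
print(min(samples), max(samples))  # values are not restricted to a fixed set
```

Unlike coin flips or die rolls, these samples are not confined to a handful of outcomes; each one is a floating-point approximation of a real number.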
