Probability Distributions (Binomial and Bernoulli Distributions)

Probability distributions underpin many concepts in data science, so a working knowledge of them is essential. This lesson introduces them.

We'll cover the following...

What are probability distributions?
- Example
  - Probability Histogram
Types of random variables
Bernoulli Distribution
- Example
Binomial Distribution
- Example

What are probability distributions?

A probability distribution summarizes the probabilities associated with all possible outcomes of a random variable X. Every probability distribution has a characteristic shape, described by properties such as the mean (expected value), variance, skewness, and kurtosis.

Example

When we roll a fair die, all six outcomes are equally likely, so each has probability $\frac{1}{6}$. The table below lists them.

Outcome of Rolling a Die	Probability
1	1/6
2	1/6
3	1/6
4	1/6
5	1/6
6	1/6
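
The table above can be checked with a few lines of Python. This is a minimal sketch, not part of the original lesson: the name `die_pmf` is chosen here for illustration, and exact fractions are used so the results are not affected by rounding. It confirms that the probabilities sum to 1 and computes two of the properties mentioned earlier, the mean and the variance.

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face has probability 1/6.
# (die_pmf is an illustrative name, not from the lesson.)
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# The probabilities of all possible outcomes must sum to 1.
total = sum(die_pmf.values())

# Mean (expected value): sum of outcome * probability.
mean = sum(face * p for face, p in die_pmf.items())

# Variance: expected squared deviation from the mean.
variance = sum((face - mean) ** 2 * p for face, p in die_pmf.items())

print(total, mean, variance)  # 1 7/2 35/12
```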

Probability Histogram

Plotting the outcome of the die roll on the x-axis and its probability on the y-axis gives the probability histogram: six bars of equal height, one per outcome.

Types of random variables

Discrete random variables

Discrete random variables are variables whose outcomes take values from a discrete set. The function that gives the probability of each outcome of a discrete random variable, that is, its probability distribution, is called the Probability Mass Function (PMF).

Example

Let X be the random variable denoting the sum of two fair dice. When two dice are rolled, the possible outcomes on the faces of the dice are (1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (2,2), and so on, for a total of 36 equally likely outcomes. The possible sums and their probabilities are listed below.

X (the sum of two dice)	Probability
2	1/36
3	2/36
4	3/36
5	4/36
6	5/36
7	6/36
8	5/36
9	4/36
10	3/36
11	2/36
12	1/36
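
The probabilities in this table come from counting how many of the 36 ordered outcomes produce each sum. A short sketch of that counting, assuming two fair six-sided dice (the names `counts` and `two_dice_pmf` are illustrative):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 36 ordered outcomes (a, b) and count how often each sum occurs.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Divide each count by 36 to get the probability mass function of the sum.
# Fraction reduces automatically, e.g. 6/36 is stored as 1/6.
two_dice_pmf = {s: Fraction(c, 36) for s, c in sorted(counts.items())}

# Print the unreduced form so it matches the table row by row.
for s in two_dice_pmf:
    print(s, f"{counts[s]}/36")
```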

Probability Histogram

Plotting the sum of the two dice on the x-axis and its probability on the y-axis gives the probability histogram, which rises from 1/36 at a sum of 2 to a peak of 6/36 at a sum of 7 and falls back to 1/36 at a sum of 12.
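
A rough text version of this histogram can be produced without any plotting library. In the sketch below, which assumes two fair dice, each `#` stands for one of the 36 outcomes, so the bar length is proportional to the probability. The layout is an illustrative choice, not the lesson's original figure.

```python
from collections import Counter
from itertools import product

# Count how many of the 36 ordered outcomes give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# One bar per sum; each '#' represents a probability of 1/36.
bars = [f"{s:>2} | {'#' * counts[s]} {counts[s]}/36" for s in range(2, 13)]
print("\n".join(bars))
```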

Continuous Random Variables

Continuous random variables are variables whose outcomes can take any real value within a range. The probability distribution of a continuous random variable is defined by the Probability Density Function (PDF), in contrast to the Probability Mass Function used for a discrete random variable.

A continuous random variable takes any single exact value with probability zero. Consequently, its probability distribution cannot be given in tabular form.

The probability distribution of a continuous random variable is represented by a density curve. The probability that X lies in an interval is the area under the density curve between the interval's endpoints. One of the most commonly used continuous probability distributions is the Gaussian, or Normal, distribution.
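
To make the "area under the curve" idea concrete, the sketch below uses `NormalDist` from Python's standard `statistics` module (available from Python 3.8). The standard normal distribution (mean 0, standard deviation 1) and the interval endpoints are assumptions chosen for illustration. The area between two points is computed as the difference of the cumulative distribution function at those points.

```python
from statistics import NormalDist

# Standard normal distribution; the parameters are an illustrative choice.
std_normal = NormalDist(mu=0.0, sigma=1.0)


def interval_probability(dist, a, b):
    """P(a <= X <= b): the area under the density curve between a and b."""
    return dist.cdf(b) - dist.cdf(a)


# Probability that X falls within one standard deviation of the mean.
p_within_one = interval_probability(std_normal, -1.0, 1.0)

# An interval of zero width has zero area, so a single exact value
# has probability zero.
p_exact = interval_probability(std_normal, 0.5, 0.5)

print(round(p_within_one, 4))  # 0.6827
print(p_exact)                 # 0.0
```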

Bernoulli Distribution

The Bernoulli distribution is a discrete probability distribution with only two outcomes: 1 (success) and 0 (failure). The probability of success is denoted by “p”, and the probability of failure by “q”, where q = 1 - p.

The Probability Mass Function, which gives the probability of each outcome, is

$P(x) = p^x q^{1-x} = p^x (1-p)^{1-x}$ ...
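
The formula can be translated directly into a small function. This is a minimal sketch: the function name `bernoulli_pmf` and the success probability p = 0.25 are assumptions made for illustration. Substituting x = 1 leaves p, and substituting x = 0 leaves 1 - p, so the two probabilities sum to 1.

```python
def bernoulli_pmf(x, p):
    """Return P(x) = p**x * (1 - p)**(1 - x) for an outcome x in {0, 1}."""
    if x not in (0, 1):
        raise ValueError("x must be 0 (failure) or 1 (success)")
    if not 0 <= p <= 1:
        raise ValueError("p must lie between 0 and 1")
    return p ** x * (1 - p) ** (1 - x)


p = 0.25  # hypothetical probability of success

print(bernoulli_pmf(1, p))  # 0.25 (probability of success, p)
print(bernoulli_pmf(0, p))  # 0.75 (probability of failure, q = 1 - p)
```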
