Bayes' Theorem

Get introduced to the concept of Bayes' Theorem and calculate the posterior probability.

We'll cover the following

Calculating the posterior probability

Bayes’ Theorem describes a way of finding a conditional probability when we know certain other probabilities. The following equation mathematically denotes Bayes’ Theorem:

$P(Hypothesis|Evidence)=P(Hypothesis)\cdot\frac{P(Evidence|Hypothesis)}{P(Evidence)}$

Bayes’ Theorem says we can calculate the posterior probability from a prior probability and some evidence-related modifier.
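One way to see where this equation comes from, sketched here using only the definition of conditional probability: the probability that both the hypothesis and the evidence hold can be written in two ways,

$P(Hypothesis,Evidence)=P(Hypothesis|Evidence)\cdot P(Evidence)=P(Evidence|Hypothesis)\cdot P(Hypothesis)$

Dividing both sides by $P(Evidence)$, which is assumed to be greater than 0, yields

$P(Hypothesis|Evidence)=P(Hypothesis)\cdot\frac{P(Evidence|Hypothesis)}{P(Evidence)}$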

The posterior denotes what we believe about the $Hypothesis$ after gathering the new information about the $Evidence$. It is a conditional probability of the kind we discussed above. The prior probability denotes what we believed about the $Hypothesis$ before we gathered the new information. It is the overall probability of our $Hypothesis$.

The modifier of the new information denotes the relative change of our belief about $Hypothesis$ caused by the $Evidence$ .

This modifier is the quotient of the backward probability ($P(Evidence|Hypothesis)$) and the probability of the new piece of information ($P(Evidence)$). The backward probability, which is the numerator of the modifier, answers the question: what is the probability of observing this evidence in a world where our hypothesis is true? The denominator is the probability of observing the evidence on its own.

When we see the evidence often in a world where the hypothesis is true but rarely on its own, the modifier is greater than 1 and the evidence supports the hypothesis. Conversely, if we see the evidence everywhere but rarely in a world where the hypothesis is true, the modifier is less than 1 and the evidence opposes the hypothesis.
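The two cases can be made concrete with a minimal Python sketch. The function names `modifier` and `posterior` and all of the probabilities below are assumptions chosen for illustration; they do not come from a real dataset.

```python
def modifier(p_evidence_given_hypothesis: float, p_evidence: float) -> float:
    """Return the modifier P(Evidence|Hypothesis) / P(Evidence)."""
    if p_evidence <= 0:
        raise ValueError("P(Evidence) must be greater than 0")
    return p_evidence_given_hypothesis / p_evidence


def posterior(prior: float, p_evidence_given_hypothesis: float, p_evidence: float) -> float:
    """Return P(Hypothesis|Evidence) = P(Hypothesis) * modifier."""
    return prior * modifier(p_evidence_given_hypothesis, p_evidence)


# Made-up prior belief in the hypothesis.
prior = 0.4

# The evidence is common when the hypothesis is true (0.5) but rare overall (0.25).
# The modifier is 2.0, so the belief in the hypothesis rises.
supporting = posterior(prior, 0.5, 0.25)

# The evidence is rarer when the hypothesis is true (0.125) than overall (0.25).
# The modifier is 0.5, so the belief in the hypothesis falls.
opposing = posterior(prior, 0.125, 0.25)

print(supporting, opposing)
```

With these numbers, the supporting evidence raises the belief from 0.4 to 0.8, and the opposing evidence lowers it to 0.2.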

The farther the modifier is away from 1, the more it changes the probability. A modifier of precisely 1 would not change the probability at all. Let’s define the value of the informativeness as the modifier’s distance to 1.

$Informativeness=\left|\frac{P(Evidence|Hypothesis)}{P(Evidence)}-1\right|$
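As a small illustration, the informativeness can be computed directly from the two probabilities. The function name `informativeness` and the input values are assumptions made up for this sketch.

```python
def informativeness(p_evidence_given_hypothesis: float, p_evidence: float) -> float:
    """Return |P(Evidence|Hypothesis) / P(Evidence) - 1|."""
    if p_evidence <= 0:
        raise ValueError("P(Evidence) must be greater than 0")
    return abs(p_evidence_given_hypothesis / p_evidence - 1)


# Made-up probabilities for three pieces of evidence.
strongly_supporting = informativeness(0.5, 0.25)    # modifier 2.0
opposing = informativeness(0.125, 0.25)             # modifier 0.5
uninformative = informativeness(0.25, 0.25)         # modifier 1.0

print(strongly_supporting, opposing, uninformative)
```

A modifier of 2.0 gives an informativeness of 1.0, a modifier of 0.5 gives 0.5, and a modifier of exactly 1 gives 0, which matches the statement that such evidence does not change the probability at all.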

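If we have one hypothesis $H$ and multiple pieces of evidence $E_1, E_2, \ldots, E_n$, then we have $n$ modifiers $M_1, M_2, \ldots, M_n$, where $M_i=\frac{P(E_i|H)}{P(E_i)}$. Treating the pieces of evidence as independent of one another, we multiply the prior by every modifier: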

$P(H|E_1,E_2,\ldots,E_n)=\frac{P(E_1|H)}{P(E_1)}\cdot\frac{P(E_2|H)}{P(E_2)}\cdots\frac{P(E_n|H)}{P(E_n)}\cdot P(H)$
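The product of modifiers can be sketched in a few lines of Python. The helper name `posterior_from_evidence` and the numbers are assumptions for illustration only. Because the formula multiplies the modifiers as if the pieces of evidence were independent, the result is best read as an estimate, and with strongly correlated evidence it can even exceed 1.

```python
import math


def posterior_from_evidence(prior, likelihoods, marginals):
    """Multiply the prior P(H) by one modifier P(E_i|H) / P(E_i) per piece of evidence.

    likelihoods holds the values P(E_i|H), marginals holds the values P(E_i).
    """
    if len(likelihoods) != len(marginals):
        raise ValueError("need exactly one P(E_i|H) for every P(E_i)")
    modifiers = [p_e_given_h / p_e for p_e_given_h, p_e in zip(likelihoods, marginals)]
    return prior * math.prod(modifiers)


# Made-up values: the first modifier is 2.0, the second is 0.75.
estimate = posterior_from_evidence(
    prior=0.4,
    likelihoods=[0.5, 0.375],
    marginals=[0.25, 0.5],
)
print(estimate)  # roughly 0.6, up to floating-point rounding
```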

What does that mean in practice?

Our $Hypothesis$ is that a passenger survived the Titanic shipwreck. We have two pieces of evidence: $Female$ and $SecondClass$.

$P(Survived)$ is the overall probability of a passenger surviving.
$P(Female)$ is the probability that a passenger is female.
$P(SecondClass)$ is the probability that a passenger holds a second-class ticket.
$P(Female|Survived)$ denotes how likely it is that a passenger who survived is female.
$P(SecondClass|Survived)$ denotes how likely it is that a passenger who survived had a second-class ticket.

The following equation shows how to calculate the probability that a female passenger with a second-class ticket survived:

$P(Survived|SecondClass,Female)=\frac{P(SecondClass|Survived)}{P(SecondClass)}\cdot\frac{P(Female|Survived)}{P(Female)}\cdot P(Survived)$

Let’s have a look at the Python code.
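What follows is a minimal sketch of this calculation, not a definitive implementation. It assumes a small, made-up list of passengers rather than the real Titanic dataset, so the resulting numbers are purely illustrative. The names `GROUPS`, `passengers`, and `share` are hypothetical helpers introduced for this example.

```python
# Made-up toy groups, NOT the real Titanic data:
# (survived, sex, ticket class, number of passengers in the group)
GROUPS = [
    (True, "female", 1, 2),
    (True, "female", 2, 2),
    (True, "female", 3, 1),
    (True, "male", 1, 1),
    (True, "male", 2, 1),
    (True, "male", 3, 1),
    (False, "female", 2, 1),
    (False, "female", 3, 2),
    (False, "male", 1, 1),
    (False, "male", 2, 2),
    (False, "male", 3, 6),
]

# Expand the groups into one record per passenger.
passengers = [
    {"survived": survived, "sex": sex, "pclass": pclass}
    for survived, sex, pclass, count in GROUPS
    for _ in range(count)
]


def share(rows, predicate):
    """Return the fraction of rows for which predicate(row) is true."""
    return sum(1 for row in rows if predicate(row)) / len(rows)


def is_female(passenger):
    return passenger["sex"] == "female"


def is_second_class(passenger):
    return passenger["pclass"] == 2


survivors = [p for p in passengers if p["survived"]]

# Prior and the probabilities of each piece of evidence on its own.
p_survived = share(passengers, lambda p: p["survived"])
p_female = share(passengers, is_female)
p_second_class = share(passengers, is_second_class)

# Backward probabilities: the evidence among passengers who survived.
p_female_given_survived = share(survivors, is_female)
p_second_class_given_survived = share(survivors, is_second_class)

# Posterior = modifier(SecondClass) * modifier(Female) * prior.
p_survived_given_evidence = (
    (p_second_class_given_survived / p_second_class)
    * (p_female_given_survived / p_female)
    * p_survived
)

print(f"P(Survived) = {p_survived:.3f}")
print(f"P(Survived|SecondClass,Female) = {p_survived_given_evidence:.3f}")
```

In this toy data, both modifiers are greater than 1, so both pieces of evidence raise the probability of survival above the prior.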
