Solve Problems in Bayesian Inference

In this lesson, we will review concepts learned so far and learn how to solve problems using Bayesian Inference.

Review of the Concepts Learned So Far

Since we have covered many new concepts, this is a good time to quickly review where we are:

  1. We’re representing a particular discrete probability distribution $P(A)$ over a small number of members of a particular type $A$ by IDiscreteDistribution<A>.
  2. We can condition a distribution — by discarding certain possibilities from it — with Where.
  3. We can project a distribution from one type to another with Select.
  4. A conditional probability $P(B|A)$, the probability of $B$ given that some $A$ is true, is represented as a likelihood function of type Func<A, IDiscreteDistribution<B>>.
  5. We can “bind” a likelihood function onto a prior distribution with SelectMany to produce a joint distribution.
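
Read as equations, the three operations in the list look roughly like this. This is only a sketch: it assumes that Where renormalizes the weights that survive the filter, and the predicate $c$, the projection $f$, and the bracket $[c(a)]$ (1 when $c(a)$ is true, 0 otherwise) are our own notation, not names from the lesson's code.

```latex
% Where: condition on a predicate c, discarding every value for which c is false
P(a \mid c) = \frac{P(a)\,[c(a)]}{\sum_{a'} P(a')\,[c(a')]}

% Select: project through f, adding up the probability of every a that maps to b
P_f(b) = \sum_{a \,:\, f(a) = b} P(a)

% SelectMany: bind the likelihood P(B|A) onto the prior P(A) to form the joint
P(a \& b) = P(a) \times P(b|a)
```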

These are all good results and we hope you agree that we have already produced a much richer and more powerful abstraction over randomness than System.Random provides.

Bayes’ Theorem

In this lesson, everything is really going to come together to reveal that we can use these tools to solve interesting problems in probabilistic inference.

To show how, we’ll need to start by reviewing Bayes’ Theorem.

If we have a prior $P(A)$ and a likelihood $P(B|A)$, we know that we can “bind” them together to form the joint distribution. That is, the probability of $A$ and $B$ both happening is the probability of $A$ multiplied by the probability that $B$ happens given that $A$ has happened:

$$P(A \& B) = P(A) \times P(B|A)$$

The same reasoning works the other way around. If we have $P(B)$ as our prior and $P(A|B)$ as our likelihood, then:

$$P(B \& A) = P(B) \times P(A|B)$$

But $(A \& B)$ is the same as $(B \& A)$, and things equal to the same thing are equal to each other. Therefore:

$$P(A) \times P(B|A) = P(B) \times P(A|B)$$

Let’s suppose that $P(A)$ is our prior and $P(B|A)$ is our likelihood. In the equation above, the term $P(A|B)$ is called the posterior and, provided $P(B)$ is not zero, can be computed like this:

$$P(A|B) = P(A) \times P(B|A) \div P(B)$$
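
The divisor $P(B)$ does not have to be supplied separately. As a sketch of that missing step: summing the joint distribution over every possible value of $A$ recovers $P(B)$ (the law of total probability), so the posterior can be written using only the prior and the likelihood. The summation index $a'$ is our own notation.

```latex
% Total probability: add up the joint over every possible value a' of A
P(B) = \sum_{a'} P(a') \times P(B|a')

% Bayes' Theorem with the divisor expanded
P(A|B) = \frac{P(A) \times P(B|A)}{\sum_{a'} P(a') \times P(B|a')}
```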

Let’s move away from abstract mathematics and illustrate an example by using the code we’ve written so far.

We can step back a few lessons and re-examine our prior and likelihood example for Frob Syndrome. Recall that this was a made-up study of a made-up condition which we believe may be linked to height. We’ll use the same weights as in that earlier lesson.

That is to say: we have $P(Height)$, we have a likelihood function $P(Severity|Height)$, and we wish to first compute the joint probability distribution $P(Height \& Severity)$:

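// Prior: P(Height), with weights 5 : 2 : 1.
var heights = new List<Height>() { Tall, Medium, Short };
var prior = heights.ToWeighted(5, 2, 1);
// Assumed definition (not shown in this lesson); the order is inferred from the output below.
var severity = new List<Severity>() { Severe, Moderate, Mild };
// Likelihood: P(Severity|Height).
IDiscreteDistribution<Severity> likelihood(Height h)
{
    switch (h)
    {
        case Tall: return severity.ToWeighted(10, 11, 0);
        case Medium: return severity.ToWeighted(0, 12, 5);
        default: return severity.ToWeighted(0, 0, 1);
    }
}
var joint = prior.Joint(likelihood);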

This produces:

(Tall, Severe):850
(Tall, Moderate):935
(Medium, Moderate):504
(Medium, Mild):210
(Short, Mild):357
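
To see where these integer weights come from, here is the arithmetic worked by hand. The prior weights 5, 2, 1 sum to 8, and the likelihood weights for Tall, Medium and Short sum to 21, 17 and 1. How the library scales the weights internally is not shown in this lesson, so treat the common-denominator step as an assumption; it is consistent with the output, because 8 × 21 × 17 = 2856 is exactly the sum of the five weights listed. Combinations with a zero likelihood weight, such as (Tall, Mild), do not appear.

```latex
P(Tall \& Severe)     = \frac{5}{8} \times \frac{10}{21} = \frac{50}{168} = \frac{850}{2856}
P(Tall \& Moderate)   = \frac{5}{8} \times \frac{11}{21} = \frac{55}{168} = \frac{935}{2856}
P(Medium \& Moderate) = \frac{2}{8} \times \frac{12}{17} = \frac{24}{136} = \frac{504}{2856}
P(Medium \& Mild)     = \frac{2}{8} \times \frac{5}{17}  = \frac{10}{136} = \frac{210}{2856}
P(Short \& Mild)      = \frac{1}{8} \times \frac{1}{1}   = \frac{1}{8}    = \frac{357}{2856}
```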

Now the question is: what is the posterior, $P(Height|Severity)$?
