Techniques for Sampling
In the previous lesson, we went through a loose, hand-wavy definition of what it means to have a “weighted” continuous distribution: our weights are now doubles, and are given by a Probability Distribution Function (PDF); the probability of a sample coming from any particular range is the area under the curve over that range, divided by the total area under the function. (Which need not be 1.0.)
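To make the “area under the curve” idea concrete, here is a minimal sketch, not from the lesson itself, that approximates those areas numerically with a midpoint Riemann sum; the class and method names (`AreaDemo`, `Area`) are hypothetical helpers for illustration only.

```csharp
using System;
using static System.Math;

static class AreaDemo
{
    // Midpoint Riemann sum approximating the area under f over [lo, hi].
    public static double Area(Func<double, double> f, double lo, double hi, int n = 100_000)
    {
        double dx = (hi - lo) / n;
        double sum = 0.0;
        for (int i = 0; i < n; i += 1)
            sum += f(lo + (i + 0.5) * dx);
        return sum * dx;
    }

    public static void Main()
    {
        // A non-normalized weight function; its total area is sqrt(pi), not 1.0.
        Func<double, double> weight = x => Exp(-x * x);
        double total = Area(weight, -10.0, 10.0);
        double right = Area(weight, 0.0, 10.0);
        // By symmetry, half of the area lies to the right of zero, so the
        // probability of a sample landing in [0, 10] is about 0.5.
        Console.WriteLine(right / total);
    }
}
```

Dividing by the total area is what makes the non-normalized weight function behave like a probability.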
Implementing Sample() to Conform to the Distribution
A central question for the rest of this course will be this problem: suppose we have a delegate that implements a non-normalized PDF; can we implement Sample() such that the samples conform to the distribution?
The short answer is: in general, no.
A delegate from double to double that is defined over the entire range of doubles has an astronomically large space of possible inputs and outputs, on the order of 2^64 distinct values. Consider, for example, a function that has a high-probability lump in the neighborhood of one value, another lump somewhere far away, and is zero everywhere else; if you know nothing about the function, how would you know to look there?
We need to know something about the PDF to implement Sample().
The long answer is: if we can make a few assumptions, then sometimes we can do a pretty good job.
As we’ve mentioned before in this course: if we know the quantile function associated with the PDF then we can very quickly sample from the distribution. We just sample from a standard continuous uniform distribution, call the quantile function with the sampled value, and the value returned is our sample. Let’s assume that we do not know the quantile function of our PDF.
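To see why a known quantile function makes sampling trivial, here is a hedged sketch using a distribution whose quantile function we do know in closed form: the exponential distribution with rate 1, whose quantile (inverse CDF) is -ln(1 - p). The names (`QuantileSampling`, `ExponentialQuantile`) are hypothetical, not from this course.

```csharp
using System;
using static System.Math;

static class QuantileSampling
{
    // Quantile (inverse CDF) of the exponential distribution with rate 1.
    public static double ExponentialQuantile(double p) => -Log(1.0 - p);

    // If the quantile function is known, sampling is a single uniform draw:
    // sample p uniformly from [0, 1) and return quantile(p).
    public static double Sample(Random random, Func<double, double> quantile) =>
        quantile(random.NextDouble());

    public static void Main()
    {
        var random = new Random(42);
        const int n = 1_000_000;
        double sum = 0.0;
        for (int i = 0; i < n; i += 1)
            sum += Sample(random, ExponentialQuantile);
        // The exponential distribution with rate 1 has mean 1, so the
        // sample mean should come out close to 1.0.
        Console.WriteLine(sum / n);
    }
}
```

This is exactly the recipe described above; the difficulty for the rest of the course is that our PDF's quantile function is unknown.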
Let’s look at an example. Suppose we have this weight function:
Func<double, double> Mixture = x =>
    Exp(-x * x) + Exp((1.0 - x) * (x - 1.0) * 10.0);
If we graph that out, it looks like this: