Search⌘ K

Non-uniform Discrete Distribution (Continued)

Learn how to implement advanced sampling algorithms for non-uniform discrete distributions in C#. Understand how rejection sampling works, its limitations, and prepare to explore the alias method for improved efficiency in weighted random selection.

In the previous lesson, we sketched out an O(logO(log n)n) algorithm for sampling from a weighted distribution of integers by implementing a discrete variation on the “inverse transform” method where the “inverse” step involves a binary search.


Modification of Non-Uniform Discrete Distribution

Can we do better than O(logn)O(log n)? Yes — but we’ll need an entirely different method. We will sketch out two methods, and implement one this time, and the other in the next lesson.

Again, let’s suppose we have some weights, say 1010, 1111, 55. We have three possible results, and the highest weight is 1111. Let’s construct a 33 by 1111 rectangle that looks like our ideal histogram; we’ll put dashes to indicate the “spaces”:

0|**********-
1|***********
2|*****------

Here’s an algorithm for sampling from this distribution:

  1. Uniformly choose a random row and column in the rectangle; it’s easy to choose a number from 00 to 22 for the row and a number from 00 to 1010 for the column.
  2. If that point is a *, then the sample is the generated row number.
  3. If the point is a , try again.

Basically, we’re throwing darts at the rectangle, and the likelihood of hitting a valid point on a particular row is proportional to the probability of that row.

In case that’s not clear, let’s work out the probabilities. To throw our dart, first we’ll pick a row, uniformly, and then we’ll pick a point in that row, uniformly. We have a 13×1011=1033\frac{1}{3} \times \frac{10}{11} = \frac{10}{33} ...