Sampling from Known Distributions
A primer on basic sampling methods for simulations and other statistical applications.
Oct. 21, 2024

Sampling from a probability distribution is an essential procedure in data modelling and simulation. For example, we may want to simulate foot traffic in a business in order to observe patterns and better allocate resources. Generating samples is straightforward if we have a model of the data we intend to simulate (i.e., we know its distribution). In this post, we review some basic sampling methods.

Inverse Transform Sampling

To simplify our approach, let $f$ be a probability density function (PDF) over the real line $\mathbb{R}$. If our target PDF is continuous over $\mathbb{R}$ and its cumulative distribution function (CDF) is strictly increasing, then we have favourable conditions to apply inverse transform sampling. Define the CDF as $F(x) = \int_{-\infty}^{x} f(t)\, dt$. Recall that for a real number $x$, evaluating the CDF at $x$ gives the probability that a random variable distributed according to $f$ takes on a value less than or equal to $x$.

Given these assumptions, our CDF is one-to-one and onto, which means that we can uniquely define its inverse $F^{-1}$ at every point in $[0, 1]$. Concretely, for any $x$, let $F(x) = y$, where $y$ is a probability, and so we have $F^{-1}(y) = x$. This inverse function, also known as a quantile function, is the key to this sampling method.
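For example, the exponential distribution with rate $\lambda > 0$ has PDF $f(x) = \lambda e^{-\lambda x}$ and CDF $F(x) = 1 - e^{-\lambda x}$ for $x \geq 0$; solving $y = F(x)$ for $x$ gives the closed-form quantile function $F^{-1}(y) = -\ln(1 - y)/\lambda$.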

Intuitively, inverse transform sampling (ITS) follows from the fact that the uniform distribution over $[0, 1]$ can be mapped (or transformed) to our target distribution via $F^{-1}$. That is, once we know the quantile function, all we need is a way to generate samples uniformly over $[0, 1]$; we then apply the mapping to produce samples distributed according to our target distribution.
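As a concrete illustration, here is a minimal Python sketch of ITS for the exponential distribution above, where the quantile function is available in closed form (the function name and parameters are illustrative):

```python
import numpy as np

def sample_exponential(rate, n, seed=None):
    """Draw n samples from Exp(rate) via inverse transform sampling."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(0.0, 1.0, size=n)  # step 1: uniform samples over [0, 1]
    return -np.log(1.0 - u) / rate     # step 2: map through F^{-1}(y) = -ln(1 - y) / rate

samples = sample_exponential(rate=2.0, n=10_000)
print(samples.mean())  # should be close to the true mean 1 / rate = 0.5
```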

Simulation: ITS


Enable javascript to load interactive content.

This simulation uses inverse transform sampling to generate samples according to a distribution defined by a mixture of Gaussians. A Gaussian (normal) distribution has a PDF parameterized by a mean $\mu$ and standard deviation $\sigma$ given by $$p(x, \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}},$$ and we can define a finite mixture as a weighted (convex) combination of $n$ Gaussians with means $\mu_1, \mu_2, \ldots, \mu_n$ and standard deviations $\sigma_1, \sigma_2, \ldots, \sigma_n$ as $$f(x) = \sum_{i=1}^{n} w_i \cdot p(x, \mu_i, \sigma_i),$$ where for each $i = 1, 2, \ldots, n$ the weight $w_i$ is an element of $[0, 1]$ and we have $\sum_{i=1}^{n} w_i = 1$.
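As a sketch of how this definition translates to code (using NumPy; the function names are illustrative), the mixture PDF can be evaluated directly as the weighted sum:

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """PDF of a Gaussian with mean mu and standard deviation sigma."""
    return np.exp(-((x - mu) ** 2) / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

def mixture_pdf(x, weights, mus, sigmas):
    """Weighted (convex) combination of Gaussian PDFs."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas))
```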

Since integration is a linear operator, we can derive a CDF $F$ for our mixture distribution as a weighted sum of Gaussian CDFs: $$F(x) = \sum_{i=1}^{n} \frac{w_i}{2} \cdot \left[1 + \operatorname{erf}\!\left(\frac{x - \mu_i}{\sigma_i \sqrt{2}}\right)\right],$$ where $\operatorname{erf}$ is the error function.
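Continuing the sketch, SciPy exposes the error function as scipy.special.erf, so the mixture CDF follows the same pattern:

```python
import numpy as np
from scipy.special import erf

def mixture_cdf(x, weights, mus, sigmas):
    """CDF of the Gaussian mixture: a weighted sum of Gaussian CDFs."""
    return sum(0.5 * w * (1.0 + erf((x - m) / (s * np.sqrt(2.0))))
               for w, m, s in zip(weights, mus, sigmas))
```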

Unfortunately, the inverse $F^{-1}$ of our mixture CDF cannot be evaluated in closed form, but we can approximate it with numerical methods. In this case, given some probability $y$, we can use Halley's root-finding algorithm to solve for the $x$ satisfying $F(x) - y = 0$, which is exactly $F^{-1}(y) = x$; note that this approximation gets closer to the true value with more iterations of the root-finding algorithm.
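Below is a minimal sketch of this inversion, reusing numpy (as np), gaussian_pdf, mixture_pdf, and mixture_cdf from the snippets above. It applies Halley's update $x \leftarrow x - \frac{2 g g'}{2 (g')^2 - g\, g''}$ to $g(x) = F(x) - y$, whose derivatives are the mixture PDF and its derivative; a production implementation would add convergence checks and bracketing safeguards.

```python
def mixture_pdf_deriv(x, weights, mus, sigmas):
    """Derivative of the mixture PDF (i.e., the second derivative of the CDF)."""
    return sum(w * (-(x - m) / s**2) * gaussian_pdf(x, m, s)
               for w, m, s in zip(weights, mus, sigmas))

def inverse_mixture_cdf(y, weights, mus, sigmas, iters=20):
    """Approximate F^{-1}(y) via Halley's method on g(x) = F(x) - y."""
    x = sum(w * m for w, m in zip(weights, mus))  # start at the mixture mean
    for _ in range(iters):
        g = mixture_cdf(x, weights, mus, sigmas) - y
        g1 = mixture_pdf(x, weights, mus, sigmas)        # g'(x)  = f(x)
        g2 = mixture_pdf_deriv(x, weights, mus, sigmas)  # g''(x) = f'(x)
        x -= (2.0 * g * g1) / (2.0 * g1**2 - g * g2)     # Halley update
    return x

# ITS from the mixture: map uniform draws through the approximate quantile function.
rng = np.random.default_rng(0)
weights, mus, sigmas = [0.3, 0.7], [-2.0, 1.5], [0.5, 1.0]
samples = [inverse_mixture_cdf(u, weights, mus, sigmas) for u in rng.uniform(size=1_000)]
```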