
Diffusion Models

Explore diffusion models in generative AI to understand their forward and reverse processes, training stability, and how they compare to GANs and VAEs. This lesson prepares you to explain their core mechanics, practical applications like Stable Diffusion, and recent advances in accelerated sampling, all of which are crucial for AI engineer interviews.

If you are interviewing for a generative AI role, diffusion models are almost guaranteed to come up, since recent breakthroughs such as Stable Diffusion and DALL·E have pushed them to the center of modern generative modeling. Interviewers often ask candidates to explain diffusion models and contrast them with GANs and VAEs to see whether they understand both current techniques and the fundamentals behind older frameworks.

The question is a proxy for assessing how well you grasp the evolution of generative models and the reasons newer approaches have gained traction. In this lesson, we will clarify diffusion models, explain why they have become so relevant, describe how they work, and compare diffusion, GANs, and VAEs.

What is a diffusion model and how does it generate data?

In generative AI, a diffusion model generates data by progressively denoising random noise until a coherent sample emerges. That’s a mouthful, so let’s unpack it. The term “diffusion” comes from the idea of particles diffusing (spreading out) in a medium—imagine ink spreading in water, eventually turning the water uniformly colored. In the context of AI, we simulate a diffusion-like process on data: we start with a piece of data (say, an image) and gradually add more and more noise until it becomes pure noise. This is called the forward diffusion process, and it’s a bit like destroying the image step by step.
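To make the forward process concrete, here is a minimal NumPy sketch of repeatedly mixing a clean sample with Gaussian noise. The function name, linear beta schedule, and step count are illustrative assumptions rather than the settings of any particular model; the point is only that each step shrinks the remaining signal slightly and adds a little fresh noise, so after enough steps the result is essentially pure noise.

```python
import numpy as np

def forward_diffusion(x0, num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Illustrative forward diffusion: gradually corrupt x0 into noise."""
    betas = np.linspace(beta_start, beta_end, num_steps)  # toy linear noise schedule
    x = x0.copy()
    trajectory = [x0]
    for beta in betas:
        noise = np.random.randn(*x.shape)
        # Each step keeps most of the previous sample and blends in a bit of fresh noise.
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * noise
        trajectory.append(x)
    return trajectory  # trajectory[-1] is approximately standard Gaussian noise

# Toy usage: a flattened "image" of 64 pixels
x0 = np.random.rand(64)
steps = forward_diffusion(x0)
print(len(steps), round(steps[-1].mean(), 2), round(steps[-1].std(), 2))
```

After enough steps, the sample’s statistics are close to a standard normal distribution, which is exactly why the reverse (denoising) process can start from pure Gaussian noise.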

Educative byte: The theoretical foundations of diffusion models date back to 2015 with Sohl-Dickstein et al.'s paper "Deep Unsupervised Learning using Nonequilibrium Thermodynamics." However, they only became practical for high-quality generation after Ho et al.'s 2020 paper "Denoising Diffusion Probabilistic Models" (DDPM), which showed they could match GAN quality. The five-year gap between theory and practice is a reminder that good ideas sometimes need the right implementation details to shine.

Formally, the noisy sample at the time step $t$ is computed as:

$$\mathbf{x}_t = \sqrt{\bar{\alpha}_t}\,\mathbf{x}_0 + \sqrt{1-\bar{\alpha}_t}\,\boldsymbol{\epsilon}$$

Where:

  • $\mathbf{x}_0$: Original clean sample (e.g., an image)

  • $\mathbf{x}_t$: Noisy version of the sample at the timestep ...