Variational Autoencoder: Theory

Dive into the mathematics behind variational autoencoders.

We'll cover the following...

Train a variational autoencoder
Reparameterization trick

In simple terms, a variational autoencoder is a probabilistic version of an autoencoder.

Why?

Because we want to be able to sample from the latent vector ( $z$ ) space to generate new data, which is not possible with vanilla autoencoders.

Each latent variable $z$ generated from the input now follows a probability distribution: the posterior distribution, denoted as $p(z|x)$.

All we need to do is find the posterior $p(z|x)$, that is, solve the inference problem.

In fact, the encoder will try to approximate the posterior by computing another distribution $q(z|x)$ , known as the variational posterior.

Note that a parametric probability distribution is fully characterized by its parameters. In the case of the Gaussian, these are the mean $\mu$ and the standard deviation $\sigma$.

So it is enough for the encoder to output the parameters (the mean $\mu$ and the standard deviation $\sigma$) of a normal distribution, denoted as $N(\mu, \sigma)$, and pass them to the decoder, instead of simply passing the latent vector $z$ as in the simple autoencoder.

Then, the decoder would receive the distribution parameters and try to reconstruct the input $x$. However, this description is not quite accurate: the decoder actually operates on a latent vector $z$ sampled from $N(\mu, \sigma)$, and sampling is a stochastic operation whose gradients cannot be computed directly. In other words, you cannot backpropagate through a sampling operation. Overcoming this obstacle is at the heart of training variational autoencoders.
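
To make this flow concrete, here is a minimal sketch in plain Python. The function names (`encode`, `sample_latent`), the fixed toy weights, and the choice to predict $\log \sigma$ and exponentiate it are all assumptions made for illustration; a real encoder would be a trained neural network. The sketch shows that the encoder maps an input to $\mu$ and $\sigma$ deterministically, while drawing $z$ from $N(\mu, \sigma)$ gives a different value on every call. That random draw is the step gradients cannot flow through directly.

```python
import math
import random

# Hypothetical toy "encoder": one linear map per output, with fixed weights.
# A real encoder would be a trained neural network.
W_MU = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]         # 2 latent dims x 3 input dims
W_LOG_SIGMA = [[0.1, 0.0, -0.1], [-0.2, 0.1, 0.0]]


def linear(weights, x):
    """Multiply a weight matrix (list of rows) by the input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def encode(x):
    """Return the Gaussian parameters (mu, sigma) of q(z|x) for input x."""
    mu = linear(W_MU, x)
    # Predict log(sigma) and exponentiate so that sigma is always positive.
    sigma = [math.exp(s) for s in linear(W_LOG_SIGMA, x)]
    return mu, sigma


def sample_latent(mu, sigma, rng):
    """Draw z ~ N(mu, sigma) independently for each latent dimension."""
    return [rng.gauss(m, s) for m, s in zip(mu, sigma)]


rng = random.Random(0)
x = [1.0, 2.0, 3.0]
mu, sigma = encode(x)

# The parameters are a deterministic function of x ...
assert encode(x) == (mu, sigma)

# ... but two samples drawn from the same parameters differ.
z1 = sample_latent(mu, sigma, rng)
z2 = sample_latent(mu, sigma, rng)
print("mu:", mu)
print("sigma:", sigma)
print("z1:", z1)
print("z2:", z2)
```

Because `z1` and `z2` differ even though `mu` and `sigma` are identical, the sampled $z$ is not a plain deterministic function of the encoder outputs.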

Let’s see how we can make it possible. (Hint: Check the reparameterization trick section below.)

Train a variational autoencoder

First things first.

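Since our goal is for the variational posterior $q(z|x)$ to be as close as possible to the true posterior $p(z|x)$, the following objective, known as the evidence lower bound (ELBO), is used to train the model. It is maximized during training; equivalently, its negative is minimized as the loss function.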

$$L_{\theta,\phi}(x) = \mathbb{E}_{q_{\phi}(z|x)} \left[ \log p_{\theta}(x|z) \right] - \mathrm{KL}\left( q_{\phi}(z|x) \,\|\, p_{\theta}(z) \right)$$
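
As a rough illustration of how the two terms of the objective above can be evaluated, the following sketch computes them for one data point in plain Python. It rests on several assumptions that the lesson does not state: the prior $p_{\theta}(z)$ is a standard normal, $q_{\phi}(z|x)$ is a Gaussian with independent dimensions (which gives the KL term a closed form), the decoder outputs Bernoulli probabilities for a binary input, and the expectation is approximated with a single sample of $z$. All function names and numeric values are hypothetical. Since the expression as written is a quantity to maximize, the sketch reports its negative as the value to minimize.

```python
import math


def gaussian_kl_to_standard_normal(mu, sigma):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + s * s - 1.0 - 2.0 * math.log(s)
                     for m, s in zip(mu, sigma))


def bernoulli_log_likelihood(x, x_prob, eps=1e-12):
    """log p(x|z) for binary x, given the decoder's per-element probabilities."""
    return sum(xi * math.log(p + eps) + (1.0 - xi) * math.log(1.0 - p + eps)
               for xi, p in zip(x, x_prob))


def elbo(x, x_prob, mu, sigma):
    """Single-sample estimate of the objective: reconstruction term minus KL."""
    return (bernoulli_log_likelihood(x, x_prob)
            - gaussian_kl_to_standard_normal(mu, sigma))


# Hypothetical values standing in for encoder and decoder outputs.
x = [1.0, 0.0, 1.0]
x_prob = [0.9, 0.2, 0.7]   # decoder output for one sampled z
mu = [0.5, -0.3]
sigma = [0.8, 1.1]

kl = gaussian_kl_to_standard_normal(mu, sigma)
objective = elbo(x, x_prob, mu, sigma)
loss = -objective  # minimizing the negative objective maximizes the objective

print("KL term:", kl)
print("objective:", objective)
print("loss:", loss)
```

The KL term is zero only when $\mu = 0$ and $\sigma = 1$ in every dimension, so it pulls the variational posterior toward the prior, while the reconstruction term rewards decoder outputs that match $x$.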
