Generating Photo-Realistic Images with StackGAN++

Explore how to create photo-realistic images from textual descriptions using StackGAN++. Learn the two-stage generation process, conditioning augmentation, and how the multi-branch structure improves image quality. Understand the key differences and advantages of StackGAN++ over the original StackGAN model, including multi-scale synthesis, unconditional loss, and color-consistency.

We'll cover the following...

High-resolution text-to-image synthesis with StackGAN
From StackGAN to StackGAN++

Generating images from description text can be treated as a conditional GAN (CGAN) process in which the embedding vector of the description sentence serves as the additional label information. The remaining challenge is how to generate large images with a CGAN. One option is to stack two CGANs together so that we can get high-quality images. This is exactly what StackGAN does (Zhang, Han, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, and Dimitris N. Metaxas. "StackGAN: Text to photo-realistic image synthesis with stacked generative adversarial networks." In Proceedings of the IEEE International Conference on Computer Vision, pp. 5907-5915. 2017).

High-resolution text-to-image synthesis with StackGAN

The embedding vector, $\varphi_t$, of the description sentence is processed by the conditioning augmentation step to create a conditional vector, $c$. In conditioning augmentation, a pair of vectors, the mean, $\mu$, and the standard deviation, $\sigma$, is calculated from the embedding vector, $\varphi_t$, and the conditional vector, $c$, is then sampled from the Gaussian distribution, $\mathcal N(\mu,\sigma^2)$. This process lets us create many more unique conditional vectors from a limited set of text embeddings and ensures that all the conditional variables obey the same Gaussian distribution. At the same time, $\mu$ and $\sigma$ are restrained so that they are not too far away from ...
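The sampling step described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the course's or the paper's implementation: the function names, the choice to parameterize the spread as a log-variance, and the hard-coded `mu` and `log_var` values (standing in for the output of a learned layer applied to $\varphi_t$) are all hypothetical. The regularizer shown is the KL divergence to the standard Gaussian, which is the penalty used in the StackGAN paper. In a real model these lists would be tensors.

```python
import math
import random


def conditioning_augmentation(mu, log_var, rng=random):
    """Sample c ~ N(mu, sigma^2) as c = mu + sigma * eps, with eps ~ N(0, 1)."""
    # Assumption: the network predicts log(sigma^2), so sigma = exp(0.5 * log_var).
    sigma = [math.exp(0.5 * lv) for lv in log_var]
    eps = [rng.gauss(0.0, 1.0) for _ in mu]
    return [m + s * e for m, s, e in zip(mu, sigma, eps)]


def kl_to_standard_normal(mu, log_var):
    """KL divergence from a diagonal Gaussian N(mu, sigma^2) to N(0, I)."""
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var)
    )


# Hypothetical values; in practice these come from a learned layer on phi_t.
mu = [0.2, -0.1, 0.0, 0.5]
log_var = [-1.0, -0.5, 0.0, -2.0]

rng = random.Random(0)  # seeded generator for repeatable sampling
c1 = conditioning_augmentation(mu, log_var, rng)
c2 = conditioning_augmentation(mu, log_var, rng)  # same text, different c
kl = kl_to_standard_normal(mu, log_var)  # penalty added to the generator loss
```

Calling `conditioning_augmentation` twice with the same `mu` and `log_var` returns two different conditional vectors, which is how a single text embedding yields many training conditions. The KL term is zero only when `mu` is all zeros and `log_var` is all zeros, so minimizing it pulls the predicted distribution toward a common Gaussian.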
