Pix2pixHD: High-Resolution Image Translation

Explore the pix2pixHD model for high-resolution image-to-image translation, including its two-stage generator approach and multi-scale discriminator design. Learn how it produces high-quality images at up to 2048 by 1024 resolution by combining global and local features, and how it uses instance boundary maps and a feature matching loss. This lesson helps you grasp the architecture and training requirements of pix2pixHD for generating detailed and realistic imagery.

Pix2pixHD, introduced by Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, and Bryan Catanzaro in "High-resolution image synthesis and semantic manipulation with conditional GANs" (Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8798-8807, 2018), is an upgraded version of the pix2pix model. The biggest improvement of pix2pixHD over pix2pix is that it supports high-quality image-to-image translation at $2048 \times 1024$ resolution.

Model architecture

To make this happen, the authors designed a two-stage approach that gradually trains and refines the networks, as shown in the following diagram. First, a lower-resolution image of $1024 \times 512$ is generated by a generator network $G_1$, called the global generator (the red box). Second, the image is enlarged by a generator network $G_2$ ...
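
To make the resolution arithmetic of the two stages concrete, here is a minimal sketch in plain Python. It is not part of the pix2pixHD code base: the helper name `stage_resolutions` is hypothetical, and it assumes that each stage doubles the width and height of the previous one and that $G_2$ outputs the full $2048 \times 1024$ image, which matches the two resolutions quoted above.

```python
def stage_resolutions(full_width, full_height, num_stages=2):
    """Return the (width, height) handled by each generator stage, coarsest first.

    Assumption for illustration: every stage doubles both the width and the
    height of the stage before it, and the last stage is at full resolution.
    """
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    factor = 2 ** (num_stages - 1)
    if full_width % factor or full_height % factor:
        raise ValueError("full resolution must be divisible by 2**(num_stages - 1)")
    # Stage 1 is the coarsest; the last stage is the full resolution.
    return [
        (full_width >> shift, full_height >> shift)
        for shift in range(num_stages - 1, -1, -1)
    ]


resolutions = stage_resolutions(2048, 1024)
for index, (width, height) in enumerate(resolutions, start=1):
    print(f"G{index}: {width}x{height} ({width * height:,} pixels)")
# G1: 1024x512 (524,288 pixels)
# G2: 2048x1024 (2,097,152 pixels)
```

Under these assumptions, the global generator works on a quarter of the pixels of the final image, which suggests why generating the coarse image first is cheaper than synthesizing the full-resolution output in a single step.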