Generative AI

Learn about the basics of advanced AI architectures and their real-world applications.

Convolutional neural networks (CNNs), their derivatives, and other deep learning architectures drove immense progress in AI-based applications. Over the last decade, machine learning researchers and engineers also began exploring models that learn representations from data in order to generate new content. This shift led to the rise of generative models such as variational autoencoders and generative adversarial networks.

Encoder-decoder architectures

Encoder-decoder architectures such as variational autoencoders (VAEs) became a popular choice for applying probabilistic inference: the encoder maps the data into a latent representation space, with constraints imposed on that space so that interpolation and manipulation remain smooth, and the decoder reconstructs data from it. In contrast, generative adversarial networks (GANs) consist of two competing neural networks: a generator that creates synthetic samples of the data and a discriminator that evaluates whether a given sample is real or generated.
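To make the two ideas concrete, here is a minimal sketch in plain Python of the quantities that usually sit at the core of each model. It assumes a VAE with a diagonal Gaussian posterior and a standard normal prior, and a GAN trained with the common binary cross-entropy objective (non-saturating generator loss). The function names are illustrative and not taken from any particular library, and real models would compute these values over tensors produced by neural networks rather than hand-written numbers.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """VAE sampling step: z = mu + sigma * eps, with eps ~ N(0, 1) per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, 1)), summed over latent dimensions.

    This is the constraint that keeps the latent space smooth enough
    for interpolation.
    """
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))


def discriminator_loss(d_real, d_fake):
    """Cross-entropy loss: push D(real) toward 1 and D(generated) toward 0."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake):
    """Non-saturating generator loss: push D(generated) toward 1."""
    return -math.log(d_fake)


# Hypothetical encoder outputs for a 2-dimensional latent space.
mu, log_var = [0.0, 1.0], [0.0, -2.0]
rng = random.Random(0)  # fixed seed so the sample is reproducible
z = reparameterize(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)

# Hypothetical discriminator scores (probability that a sample is real).
d_loss = discriminator_loss(d_real=0.9, d_fake=0.1)
g_loss = generator_loss(d_fake=0.1)

print("latent sample:", z)
print("KL term:", kl)
print("discriminator loss:", d_loss)
print("generator loss:", g_loss)
```

The two losses pull in opposite directions: when the discriminator confidently rejects a generated sample, its own loss is low while the generator's loss is high, which is the competition described above.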

These theoretical ideas led to several high-impact products. For example, StyleGAN by NVIDIA generates photorealistic synthetic images of human faces, bodies, cars, indoor scenes, and more that are hard to distinguish from real images. Because of this, StyleGAN has become a base technology for creative applications in arts, media, and gaming. NVIDIA reportedly recorded more than $100 million in sales related to GAN-based research alone in 2020, and global revenue from the GAN-based industry has reportedly reached billions of dollars. Alongside this decade of progress in AI product development, researchers at Google introduced a new type of model architecture built around self-attention: the Transformer. Transformers have since disrupted the entire AI ecosystem, with a broad range of applications on both single-modality and multimodal datasets.

The next section will examine the basics of the Transformer architecture and discuss how it is used as a backbone model architecture in almost every generative AI framework today. 

The evolution of AI

To get a glimpse of the evolution of AI architectures in the last decade, let’s look at the figure below, which highlights the block diagrams of highly applicable deep learning models used in several AI products ranging from image classification and forecasting to content generation.
