Evolving From Fine-Tuning to Zero-Shot Models

Learn how transformer models evolved from fine-tuning to zero-shot use.

We'll cover the following

Overview
Stacking decoding layers
GPT-3 engines

Overview

From the start, OpenAI’s research teams, led by Radford et al. (2018), wanted to take transformers from models trained for specific tasks to generative pretrained transformer (GPT) models. The goal was to train transformers on unlabeled data, letting the attention layers learn a language without supervision. Instead of teaching transformers to perform specific NLP tasks, OpenAI decided to train them to learn a language.

OpenAI wanted to create a task-agnostic model. So, they began to train transformer models on raw data instead of relying on data labeled by specialists. Labeling data is time-consuming and considerably slows down the training process.
