Sequence to Sequence Models

Learn how to process sequences of data (most notably text) with deep learning.

To understand attention, we need to discuss how it emerged from natural language applications. Machine translation, the task of translating an input sentence from one language to another, is a great use case.

So how did we process text and sentences before attention came along?

Formulating text in terms of machine learning

Let’s start by formulating the problem in terms of machine learning.

Thinking in terms of inputs and outputs first will help us grasp this category of models.

The representation is quite intuitive: sentences can be regarded as sequences of words.

Before introducing any fancy model, let's take a high-level look: we represent both the input and output sentences as sequences, as sketched below.
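To make this concrete, here is a minimal sketch of how sentences might be mapped to sequences of integer token indices. The vocabulary construction, the special tokens, and the helper names are illustrative assumptions, not any particular library's API.

```python
# A toy word-level tokenizer: sentences -> sequences of integer indices.
# Special tokens mark padding, start-of-sequence, and end-of-sequence.

def build_vocab(sentences):
    """Map each unique word to an integer index, reserving special tokens."""
    vocab = {"<pad>": 0, "<sos>": 1, "<eos>": 2}
    for sentence in sentences:
        for word in sentence.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab

def encode(sentence, vocab):
    """Convert a sentence into a sequence of indices, framed by <sos>/<eos>."""
    tokens = sentence.lower().split()
    return [vocab["<sos>"]] + [vocab[w] for w in tokens] + [vocab["<eos>"]]

sentences = ["the cat sat", "the dog ran"]
vocab = build_vocab(sentences)
print(encode("the cat ran", vocab))  # [1, 3, 4, 7, 2]
```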

Seq2Seq

Ideally, a model has to understand the input sentence in one language. This part is captured by the so-called "encoder", which produces an intermediate representation, denoted as z in the diagram.

We then need to convert that meaning into another language, so let's call this second model the decoder.

In fact, the terminology used in the literature is no different: such architectures are known as encoder-decoder models.
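To ground this, here is a minimal encoder-decoder sketch in PyTorch. The layer choices (GRUs), names, and dimensions are illustrative assumptions; a real seq2seq system would add training logic, teacher forcing, and a decoding loop.

```python
# A minimal encoder-decoder (seq2seq) sketch: the encoder compresses the
# source sentence into z, and the decoder generates conditioned on z.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)

    def forward(self, src):
        # src: (batch, src_len) token indices
        _, z = self.rnn(self.embed(src))
        return z  # (1, batch, hidden_dim): the intermediate representation z

class Decoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tgt, z):
        # tgt: (batch, tgt_len); z serves as the decoder's initial state
        hidden_states, _ = self.rnn(self.embed(tgt), z)
        return self.out(hidden_states)  # (batch, tgt_len, vocab_size) logits

# Usage: encode a source sequence into z, then decode conditioned on z.
src = torch.randint(0, 100, (1, 5))  # dummy source sentence
tgt = torch.randint(0, 100, (1, 7))  # dummy (shifted) target sentence
z = Encoder(vocab_size=100)(src)
logits = Decoder(vocab_size=100)(tgt, z)
print(logits.shape)  # torch.Size([1, 7, 100])
```

Note the bottleneck this design creates: the entire source sentence must be squeezed into the single fixed-size vector z, which is precisely the limitation that motivated attention.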
