Transformer Architecture: Encoder, Decoder, and Computing Output

Transformer architecture

A transformer is a type of sequence-to-sequence (seq2seq) model: it takes in a sequence of inputs and maps it to a sequence of outputs. Transformer models can work with both text and image data.

The transformer model was first proposed in the paper "Attention Is All You Need" by Vaswani et al. (https://arxiv.org/pdf/1706.03762.pdf). Like other seq2seq models, the transformer consists of an encoder and a decoder.
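To make the encoder-decoder data flow concrete, here is a minimal sketch in plain Python. It is an illustration under strong assumptions, not the architecture from the paper: the helper names (`softmax`, `attention`, `encode`, `decode`) and the toy vectors are hypothetical, and the sketch leaves out learned projection weights, multiple heads, feed-forward layers, positional encodings, and masking. It only shows that the encoder turns the input sequence into a sequence of vectors, and that the decoder reads those vectors to produce one output vector per target position.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    # Scaled dot-product attention over lists of vectors (no learned weights).
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Each output vector is a weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

def encode(source):
    # Toy encoder (assumption): a single self-attention pass over the inputs.
    return attention(source, source, source)

def decode(memory, target_so_far):
    # Toy decoder (assumption): target positions attend to the encoder output.
    return attention(target_so_far, memory, memory)

source = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 made-up input vectors
target = [[0.5, 0.5], [1.0, 0.0]]              # 2 made-up target positions

memory = encode(source)          # one vector per input position
output = decode(memory, target)  # one vector per target position
print(len(memory), len(output), len(output[0]))
```

The input and output sequences here have different lengths (3 and 2), which is the defining property of a seq2seq mapping: the decoder's output length is set by the target positions, not by the input length.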
