Recurrent Neural Networks: What, When, How and Where

This lesson discusses recurrent neural networks (RNNs) in detail.

Introduction

Your smartphone predicting your next word as you type, Alexa understanding what you say, and tasks like making stock market predictions, understanding movie plots, composing music, and translating languages, even human learning itself: can you see the common theme in this list? These are all examples where a sequence of information is crucial.

For example, look at the following sentence:

I like to have a cup of ___ for breakfast.

You are likely to guess that the missing word is coffee. But why didn't you think of sandwich or ball? Our brains automatically use context, the words earlier in the sentence, to infer the missing word. We are wired to work with sequences of information, and this is what allows us to learn from experience; as the saying goes, we are the sum total of our experiences. From language to audio to video, we are surrounded by data in which the information at any point in time depends on the information at previous steps. To work with such data, we need neural networks that can access and understand past inputs. Vanilla neural networks cannot do this: they assume that all inputs and outputs are independent of each other, mapping a fixed-size input vector to a fixed-size output vector. This is where RNNs come into play.
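To see that fixed-size constraint in code, here is a minimal sketch of a plain feedforward layer (the NumPy weights, sizes, and the name feedforward are illustrative assumptions, not part of this lesson). Every call maps one fixed-size vector to one fixed-size vector, with no memory of earlier inputs:

```python
import numpy as np

# Hypothetical layer sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))  # maps a fixed 8-dim input to a fixed 4-dim output
b = np.zeros(4)

def feedforward(x):
    # Each call is independent: the output depends only on this x.
    # No state is carried over from any previous input.
    return np.tanh(W @ x + b)

y = feedforward(rng.standard_normal(8))  # one isolated prediction
```

No matter how many times you call this layer, nothing it saw before influences the current output, which is exactly the limitation RNNs address.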

How RNNs work

RNNs are called “recurrent” neural networks because they process sequences using a mechanism that persists and accesses information from previous points in time. This allows the output at each step to depend on time-dependent, sequential information (you can think of it as a memory-based system). Concretely, these networks contain loops: the state computed at one time step is fed back in at the next, so information from earlier in the sequence stays available.
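To make the loop concrete, here is a minimal sketch of a single recurrent step (the sizes, the random weights, and the names W_xh, W_hh, and rnn_step are illustrative assumptions, not from this lesson). At every time step, the hidden state is recomputed from the current input and the previous hidden state, which is how the network persists information across the sequence:

```python
import numpy as np

# Hypothetical sizes and random weights, for illustration only.
input_size, hidden_size = 8, 16
rng = np.random.default_rng(0)
W_xh = rng.standard_normal((hidden_size, input_size)) * 0.01   # input -> hidden
W_hh = rng.standard_normal((hidden_size, hidden_size)) * 0.01  # hidden -> hidden: the "loop"
b_h = np.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    # The new hidden state mixes the current input with the previous
    # hidden state, so information from earlier steps persists.
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

# Process a 5-step sequence, carrying the hidden state forward.
h = np.zeros(hidden_size)
for x_t in rng.standard_normal((5, input_size)):
    h = rnn_step(x_t, h)  # h now summarizes everything seen so far
```

After the loop, h can be passed to an output layer, so a prediction made at the end of the sequence reflects every input that came before it.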
