Trusted answers to developer questions
Trusted Answers to Developer Questions

Related Tags

neural network
deep learning

What are Convolutional Neural Networks for NLP?

Ibrahim Moazzam

Convolutional Neural Networks (CNN)

Convolutional Neural Networks (CNN) are incredibly powerful models making immense waves in the realm of Machine Learning by taking on complex tasks such as Image/Video Recognition and Analysis.

Inspired by the visual cortexThe visual cortex is a region of the brain that receives, integrates, and processes visual information received from the retinas, the architecture of CNNs is such that it can assign importance to objects in the input by figuring their features out and then differentiating those objects from each other. To make this possible, the CNN contains various layers—the Convolution, ReLU, and Pooling layers.

Figures 1 and 2 illustrate a CNN's pipeline, from the input to the output. While Figure 1 details the half responsible for Feature Extraction, Figure 2 details the Classification half of the CNN for an application to be fed images and then determine the presence of koala bears in them.

Figure 1: Feature extraction
Figure 2: Classification

Natural Language Processing (NLP)

Now that we're well-versed with Convolutional Neural Networks (CNNs), let's briefly look at the domain we wish to apply them to.

Natural Language Processing (NLP), at its absolute minimum, is what happens when a computer is tasked with listening to words and sentences in a natural language. This is referred to as NLUNatural Language Understanding. It then forms some sort of comprehension from them. This is referred to as NLGNatural Language Generation.

Figure 3: NLP pipeline

Using both NLU and NLG, there are a host of applications for NLP. Some of these include the following:

  • Machine Translation
  • Virtual Assistants
  • Sentiment Analysis

Note: For more information on NLP, refer to this Answer

Convolutional Neural Networks for NLP

Now that we're up to speed with Convolutional Neural Networks (CNNs) and Natural Language Processing (NLP), let's go over how we're supposed to go about fusing the two.

Heads up, it's incredibly simple! The CNN illustrated in Figures 1 and 2 for our koala detection application works incredibly well, but currently, inputs are provided in the form of images.

How do we go from CNNs that work for images to CNNs that work for NLP?

For starters, the inputs to the CNN for NLP will be in the form of sentences, not images. However, you may have noticed from Figure 1 that the input type has no effect on the illustrated methodology as long as we can convert it into an array of numbers for feature extraction and classification.

Fortunately, sentences can be represented as an array of vectors, wherein each word of the sentence—known as a "token"—is a vector in a vector space comprising the entire training vocabulary. This is known as "Word Embeddings."

Total vocabulary from all the sentences
1 of 2

Given this array of vectors, we can duplicate the pipeline in Figures 1 and 2, while making changes to the filters and pooling operations appropriate to the NLP application we'd like to perform.

A diagram illustrating the pipeline for Sentence ClassificationAssigning a set of labels (or classes) to a sentence using CNNs can be seen below.

Sentence classification using a CNN

RELATED TAGS

neural network
deep learning
RELATED COURSES

View all Courses

Keep Exploring