Trusted answers to developer questions

How to generate WordClouds in Python

Harsh Jain

What is a WordCloud?

In NLP, when you want to discover the words that occur most often in your text data, you can build a WordCloud: a cloud (image) that contains words of different sizes. In a WordCloud, the size of each word denotes the frequency or importance of that word in the text data.
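To make the idea of "size follows frequency" concrete, the sketch below counts words with the standard library and scales each count relative to the most frequent word. It only illustrates the principle: the helper name relative_frequencies and the simple tokenization rule are assumptions for this example, not how the wordcloud package works internally.

```python
import re
from collections import Counter


def relative_frequencies(text):
    """Map each word to its count divided by the highest count (hypothetical helper)."""
    # Lowercase the text and treat runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    if not counts:
        return {}
    top = max(counts.values())
    # The most frequent word gets 1.0; rarer words get proportionally less.
    return {word: count / top for word, count in counts.items()}


sample = "The states shall be united. The states shall vote."
freqs = relative_frequencies(sample)
print(freqs["states"])  # 1.0
print(freqs["united"])  # 0.5
```

A word with a relative frequency of 1.0 would be drawn largest, and a word at 0.5 would be drawn noticeably smaller.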

A sample WordCloud image is shown below:

Sample WordCloud

In the WordCloud above, the words Shall, State, and United are drawn largest, which tells us they are the most important words in the complete text.

Build your own WordCloud

Now, let’s build our own WordCloud. However, before we can do that, we need to install some packages. Do this by running:

pip install wordcloud
pip install matplotlib
pip install numpy
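If you are unsure whether the installation worked, a quick check along the following lines may help. This is a minimal sketch that uses only the standard library; the helper name missing_packages is hypothetical.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


required = ["wordcloud", "matplotlib", "numpy"]
missing = missing_packages(required)
if missing:
    print("Install these first:", ", ".join(missing))
else:
    print("All required packages are available.")
```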

Take a look at the code:

main.py
from wordcloud import WordCloud
import matplotlib.pyplot as plt

text = open('./constitution.txt').read()

wordcloud = WordCloud().generate(text)

fig = plt.figure()
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
fig.savefig('output/img.png')
plt.close(fig)
Build a WordCloud in Python

Explanation:

  • We have the constitution.txt text file that contains our text data. (You can replace its contents with your own text data.)
  • In lines 1 and 2, we import the required packages.
  • In line 4, we read the text data from our file.
  • In line 6, we generate the WordCloud by providing our own text data.
  • In line 8, we create the figure using matplotlib.
  • In line 9, we display the WordCloud. We also set interpolation to bilinear to make the image look smoother.
  • In line 10, we remove the axes from the plot.
  • Finally, in lines 11 and 12, we save our figure to output/img.png and close it. The output directory must already exist, because savefig does not create it.
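When running the script outside this page, two steps may need a little more care: the input file should be closed after reading, and the folder for the output image has to exist before saving. The sketch below shows one way to handle both with pathlib. The helper names read_text and prepare_output are hypothetical, and UTF-8 is an assumed encoding for the text file.

```python
from pathlib import Path


def read_text(path):
    """Read a text file (assumed to be UTF-8); the file is closed automatically."""
    return Path(path).read_text(encoding="utf-8")


def prepare_output(path):
    """Create the parent directory of `path` if it is missing and return the path."""
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    return out
```

With these helpers, line 4 of the script could become text = read_text('./constitution.txt'), and line 11 could become fig.savefig(prepare_output('output/img.png')).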

When you run the code, a WordCloud image is generated and saved as output/img.png. You can then try it with your own text data.
