Counting Words from a File

Explore how to build a simple Python word counter by reading text files and utilizing string methods such as split and len. Understand the process of breaking text into words and counting them to develop basic text analysis skills.

Previously, we learned how to read a file and then print it onto the screen. Now, let’s take another step forward and count the number of words present in a file. Let’s look at the code widget at the end of the lesson.

The first three statements should be familiar by now: we open the file, read it, and close it.

Now let’s look at the new code.

Using the split() function

Python has several built-in functions for strings. For instance, the split() function splits a string on the separator passed as its parameter. In our program, we split on a space. The function returns a list (Python’s name for what many other languages call an array) containing the pieces of the string. To see how this works, let’s try it out in an IPython console!

Now enter the following line of code in the console to see what happens.

"The birds, they are flying away, he said.".split(",")

We’ll see the following output in the Out line.

Out[2]: ['The birds', ' they are flying away', ' he said.']

Great! Now that we’re familiar with the split() function, let’s come back to our example.

Let’s look at the line that calls split() in the code widget below.

We should have an idea of what we are doing now. English words are separated by spaces, so splitting the text we read from the file on spaces gives us a list of its words.
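
As a quick illustration of this step, here is a minimal sketch that splits a short string on spaces. The sample strings and variable names are invented for the example and are not the contents of birds.txt. It also shows one caveat worth knowing: split(" ") splits only on space characters, so a line break does not separate two words.

```python
# Sample sentence invented for illustration; not the content of birds.txt.
sample = "The birds are flying away"

# Split on a single space, as the lesson's program does.
words = sample.split(" ")
print(words)

# Caveat: split(" ") splits only on space characters, so a line break
# stays inside a piece instead of separating two words.
two_lines = "flying away\nhe said"
print(two_lines.split(" "))

# Calling split() with no argument splits on any run of whitespace,
# including newlines.
print(two_lines.split())
```

If the file has more than one line, calling split() with no argument is one way to keep words on neighboring lines from being glued together.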

Using the len() function

So, we print the words we found with print(words). Next, we call the len() function, which returns the length of a list. Remember we said the split() function breaks the string into a list? By using the len() function, we can determine how many elements the list has, and hence the number of words.
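
To make the counting step concrete, here is a small sketch of len() applied to a list. The list is written by hand to stand in for the result of split(), so its contents are an assumption made for the example.

```python
# A hand-written list standing in for the result of split().
words = ["The", "birds", "are", "flying", "away"]

# len() returns the number of elements in the list.
num_words = len(words)
print(num_words)

# Caveat: splitting on " " yields an empty string where two spaces
# occur in a row, and len() counts that empty string too.
pieces = "flying  away".split(" ")
print(pieces)
print(len(pieces))
```

The second part suggests why the count can come out slightly high when the text contains consecutive spaces.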

Here it is, our very own word counting program!

Python 3.8
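#! /usr/bin/python
f = open("birds.txt", "r")  # open the file for reading
data = f.read()  # read the whole file into one string
f.close()  # close the file once we have its contents
words = data.split(" ")  # split the text on spaces to get a list of words
print("The words in the text are:")
print(words)
num_words = len(words)  # the length of the list is the word count
print("The number of words is", num_words)  # print() puts a space between its arguments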
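
For comparison, here is one possible variation on the program above, offered as a sketch and not as the lesson’s own method. It uses a with block so the file is closed automatically, and it calls split() with no argument so that newlines also separate words. The count_words helper and the temporary sample file are hypothetical, introduced only so the example runs on its own without birds.txt.

```python
import os
import tempfile


def count_words(path):
    """Return the number of whitespace-separated words in the file at path."""
    # The with statement closes the file automatically, even on error.
    with open(path, "r") as f:
        data = f.read()
    # split() with no argument splits on any whitespace, including newlines.
    return len(data.split())


# Create a small throwaway file so the sketch runs on its own.
# Its content is invented; it is not the lesson's birds.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("The birds, they are flying away,\nhe said.\n")
    sample_path = tmp.name

num_words = count_words(sample_path)
print("The number of words is", num_words)

# Clean up the temporary file.
os.remove(sample_path)
```

Punctuation stays attached to the words here (for example, "birds," counts as one word), just as it does in the original program.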