Building a Basic Word Counter

Learn how to create a simple word counter.

Whether we’re looking to automate a task, analyze data, parse logs, talk to network services, or address other requirements, writing our own command-line tool may be the fastest, and perhaps the most fun, way to achieve our goal. Go is a modern programming language that combines the reliability of compiled languages with the ease of use and speed of dynamically typed languages. (In dynamic typing, the type of a variable is determined at runtime, so it can change over the variable’s lifetime. Python is an example of a dynamically typed language.) Go makes writing cross-platform command-line applications more approachable while also providing the features required to ensure these tools are well-designed and tested.

Before we dive into more complex programs that read and write files, parse data files, and communicate over networks, we’ll create a word counter program that will give us an idea of how to build and test a command-line application using Go. We’ll start with a basic implementation, add some features, and explore test-driven development along the way. When we’re done, we’ll have a functional word counter program and a better understanding of how to build more complex apps.

Throughout the course, we’ll develop other CLI applications to explore more advanced concepts.

Basic word counter app

Let’s create a tool that counts the number of words or lines it receives on standard input (STDIN). By default, this tool counts words. If it receives the -l flag, it counts lines instead.

We’ll start by creating the basic implementation. This version reads data from STDIN and displays the number of words. We’ll eventually add more features, but this initial version will let us get comfortable with the code for a Go-based command-line application.

Before diving into writing code for the word counter, let’s set up a project directory.

Go programs are composed of packages. A package consists of one or more Go source code files with code that can be combined into executable programs or libraries.

Starting with Go 1.11, we can combine one or more packages into Go modules. Modules are a new Go standard for grouping related packages into a single unit that can be versioned together. Modules enable consistent dependency management for our Go applications. For more information about Go modules, consult the official wiki page.

Initialize the Go module

Let’s initialize a new Go module for our project:

go mod init usercode/firstProgram/wc

We create an executable program in Go by defining a package named main that contains a function called main(). This function takes no arguments and returns no values. It serves as the entry point for our program:

package main

func main() {
	// main contents
}

Although not a requirement, by convention, the main package is usually defined in a file named main.go. We’ll use this convention throughout this course.


Code example file path

Note: For brevity, the code example path omits the root directory root@educative:/usercode/. For example, in the following code sample, the code path starts at firstProgram/wc.

The main.go file

Let’s add the package main definition to the top of the file like this:

package main
Adding the package

Next, we add the import section to bring in the libraries we’ll use to read data from STDIN and print results:

import (
	"bufio"
	"fmt"
	"io"
	"os"
)
Adding the import section

For this tool, we import:

  • The bufio package to read text.
  • The fmt package to print formatted output.
  • The io package, which provides the io.Reader interface.
  • The os package, so we can use operating system resources.

Our word counter will have two functions: main() and count(). The main() function is the starting point of the program. All Go programs that will be compiled into executable files require this function. We create this function by adding the following code into our main.go file. This function will call the count() function and print out that function’s return value using the fmt.Println() function:

func main() {
	// Calling the count function to count the number of words
	// received from the Standard Input and printing it out
	fmt.Println(count(os.Stdin))
}
Adding the main() function

Next, we define the count() function, which performs the actual counting of the words. This function receives a single input argument: an io.Reader interface. For now, think of an io.Reader as any Go type from which we can read data. In this case, the function receives STDIN and processes its contents:

func count(r io.Reader) int {
	// A scanner is used to read text from a Reader (such as files)
	scanner := bufio.NewScanner(r)
	// Define the scanner split type to words (default is split by lines)
	scanner.Split(bufio.ScanWords)
	// Defining a counter
	wc := 0
	// For every word scanned, increment the counter
	for scanner.Scan() {
		wc++
	}
	// Return the total
	return wc
}
Adding the count() function
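
Because count() accepts any io.Reader, it isn’t tied to STDIN. As a minimal sketch of that idea, the following standalone program feeds count() an in-memory string through strings.NewReader() from the standard library. The countString() helper and the sample sentence are assumptions made for illustration. They aren’t part of the lesson’s main.go file:

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// count mirrors the lesson's function: it returns the number of
// whitespace-separated words read from r.
func count(r io.Reader) int {
	scanner := bufio.NewScanner(r)
	scanner.Split(bufio.ScanWords)
	wc := 0
	for scanner.Scan() {
		wc++
	}
	return wc
}

// countString is a hypothetical helper (not part of the lesson's code)
// that wraps a string in an io.Reader so count can process it.
func countString(s string) int {
	return count(strings.NewReader(s))
}

func main() {
	// "command-line" contains no whitespace, so it counts as one word.
	fmt.Println(countString("Go makes command-line tools approachable")) // prints 5
}
```

Accepting an interface instead of os.Stdin directly is what makes it possible to exercise count() with other data sources, such as strings or files.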

The count() function uses the NewScanner() function from the bufio package to create a new scanner. A scanner is a convenient way of reading data delimited by spaces or newlines. By default, a scanner splits its input into lines, so we instruct it to read words instead by calling its Split() method with the bufio.ScanWords split function. We then define a variable, wc, to hold the word count. Each call to scanner.Scan() advances to the next token and returns true until the input is exhausted, so we loop over it, adding 1 to the counter each time. Finally, we return the word count.
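
To see how the split function changes what the scanner treats as a token, here’s a small sketch that scans the same text twice: once with bufio.ScanLines (the default behavior) and once with bufio.ScanWords. The countTokens(), countLines(), and countWords() helpers and the sample text are hypothetical names introduced only for this illustration:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// countTokens is a hypothetical helper: it counts the tokens a scanner
// produces from s when using the given split function.
func countTokens(s string, split bufio.SplitFunc) int {
	scanner := bufio.NewScanner(strings.NewReader(s))
	scanner.Split(split)
	n := 0
	for scanner.Scan() {
		n++
	}
	return n
}

// countLines splits on newlines, which is the scanner's default.
func countLines(s string) int { return countTokens(s, bufio.ScanLines) }

// countWords splits on whitespace, as the lesson's count function does.
func countWords(s string) int { return countTokens(s, bufio.ScanWords) }

func main() {
	text := "first line here\nsecond line\n"
	fmt.Println(countLines(text)) // prints 2
	fmt.Println(countWords(text)) // prints 5
}
```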

In this example, for simplicity’s sake, we’re ignoring the error that might be generated during the scanning. In production code, we should always check for errors, for example, by inspecting the value returned by scanner.Err() once the loop ends.
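
As one possible sketch of such a check, the variant below returns an error alongside the count. The countChecked() name and the flakyReader type, which simulates a read failure partway through the input, are assumptions made for illustration and aren’t part of the lesson’s code:

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strings"
)

// countChecked is a hypothetical variant of count that also reports
// any error encountered while scanning.
func countChecked(r io.Reader) (int, error) {
	scanner := bufio.NewScanner(r)
	scanner.Split(bufio.ScanWords)
	wc := 0
	for scanner.Scan() {
		wc++
	}
	// Err returns nil if scanning stopped because the input ended.
	if err := scanner.Err(); err != nil {
		return wc, err
	}
	return wc, nil
}

// flakyReader is a hypothetical io.Reader that delivers some data on
// the first read and fails on every read after that.
type flakyReader struct {
	sent bool
}

func (f *flakyReader) Read(p []byte) (int, error) {
	if !f.sent {
		f.sent = true
		return copy(p, "one two "), nil
	}
	return 0, errors.New("simulated read failure")
}

func main() {
	n, err := countChecked(strings.NewReader("one two three"))
	fmt.Println(n, err) // prints: 3 <nil>

	n, err = countChecked(&flakyReader{})
	fmt.Println(n, err) // prints: 2 simulated read failure
}
```

Returning the partial count together with the error lets the caller decide whether to report what was read so far or discard it.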

We’ve completed the basic implementation of the word count tool. Next, we’ll write tests to ensure this implementation works the way we expect it to.