Exercise: Counting Unicode Characters

Test your coding skills on counting Unicode characters.

Challenge

In this exercise, your challenge is to expand the count function to count Unicode characters.

Problem statement

In addition to counting lines and words, your tool should also be able to count the number of Unicode characters in the input.

Computers encode text using different standards. Text encoded in ASCII uses one byte per character, so counting bytes is usually enough to count characters. This works for languages whose characters fit in ASCII, such as English, but it isn't enough for languages that need the wider Unicode character set, such as Japanese. Unicode encodings such as UTF-8 might use more than one byte per character, so the byte count can be larger than the character count.
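To make the difference between bytes and characters concrete, here is a minimal sketch that compares the two counts for an English and a Japanese string. The helper name byteAndRuneCounts is made up for this illustration and is not part of the chapter's code.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCounts is a hypothetical helper for this illustration.
// It returns the number of bytes and the number of Unicode code points in s.
func byteAndRuneCounts(s string) (int, int) {
	// len reports bytes; utf8.RuneCountInString reports code points.
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	for _, s := range []string{"hello", "こんにちは"} {
		b, r := byteAndRuneCounts(s)
		fmt.Printf("%q: %d bytes, %d runes\n", s, b, r)
	}
}
```

For the ASCII string, both counts match. For the Japanese string, each character takes three bytes in UTF-8, so the byte count is three times the rune count.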

Go provides the rune data type to represent Unicode characters (more precisely, code points). Expand the program to count runes in addition to words and lines.

Coding challenge

Take some time to figure out the smartest way to solve this problem. Start from the implementation of the count function at the end of this chapter. Expand the count function to receive a new boolean parameter named countRunes. If this parameter is set to true, the function should return the number of runes in the provided input text.

If you feel stuck, refer to Go’s documentation for runes or for the bufio package. If you still need help, check the solution review in the next lesson. Good luck!

Note: If you’re looking for an extra challenge, write a function for counting runes in addition to Unicode characters.
