
Unicode and UTF-8 strings

Explore how Go represents Unicode and UTF-8 encoded strings. Understand the difference between bytes and characters in strings, the use of the rune type, and how to properly iterate over Unicode text with for-range loops to handle international characters accurately.
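As a quick taste of the byte-versus-character distinction, here is a minimal sketch. The sample string and the `runeOffsets` helper are illustrative assumptions, not part of the lesson; only `len`, `for`-`range`, and `utf8.RuneCountInString` from the standard library are relied on.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeOffsets is a hypothetical helper for illustration: it returns the
// starting byte offset of each rune in s, as reported by a for-range loop.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	// "héllo", written with an escape so the é is unambiguously U+00E9,
	// which UTF-8 encodes as two bytes.
	s := "h\u00e9llo"

	fmt.Println(len(s))                    // 6: len counts bytes
	fmt.Println(utf8.RuneCountInString(s)) // 5: number of runes (code points)

	// Ranging over a string decodes one rune per iteration;
	// i is the byte offset where that rune starts.
	for i, r := range s {
		fmt.Printf("byte offset %d: %c (U+%04X)\n", i, r, r)
	}

	fmt.Println(runeOffsets(s)) // [0 1 3 4 5]
}
```

Note that the offsets skip from 1 to 3: the loop advances by the encoded width of each rune, not by one byte at a time.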

Unicode and UTF-8

Unicode and UTF-8 are hairy subjects.

Let’s have a quick recap of Unicode and UTF-8:

  1. Unicode is an international encoding standard for use with different languages and scripts. It assigns each letter, digit, or symbol a unique numeric value that applies across different platforms and programs. Essentially, it’s a big table of “code points” that contains most (but not all) of the characters of all languages. Each code point is an index in that table which you can
...