Unicode and UTF-8 strings
Explore how Go represents Unicode and UTF-8 encoded strings. Understand the difference between bytes and characters in strings, the use of the rune type, and how to properly iterate over Unicode text with for-range loops to handle international characters accurately.
We'll cover the following...
Unicode and UTF-8
Unicode and UTF-8 are hairy subjects.
Let’s have a quick recap of Unicode and UTF-8:
- Unicode is an international encoding standard for use with different languages and scripts, in which each letter, digit, or symbol is assigned a unique numeric value that applies across different platforms and programs. Essentially, it’s a big table of “code points”. It covers most (but not all) of the characters used in the world’s languages. Each code point is an index in that table which you can