The New Character Type for UTF-8 Strings: char8_t
Get introduced to a new character type, 'char8_t'.
In addition to the character types char16_t and char32_t from C++11, C++20 gets the new character type char8_t. The type char8_t is large enough to represent any UTF-8 code unit (8 bits). It has the same size, signedness, and alignment as unsigned char, but it is a distinct type.
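char versus char8_t
A char has one byte. In contrast to a char8_t, which is meant to hold an 8-bit UTF-8 code unit, the number of bits of a byte, and hence of a char, is not fixed by the standard; it is only required to be at least 8. Nearly all implementations use 8 bits for a byte. The std::string is an alias for a std::basic_string of chars, that is, std::basic_string<char>, and it is the type that stores ordinary text such as "Hello World".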
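The statements about char can be verified in the same way. This is a small sketch, assuming a hosted C++17 or later compiler; bits_per_char is a hypothetical helper name that reports the implementation's byte width via the standard macro CHAR_BIT.

```cpp
#include <climits>
#include <string>
#include <type_traits>

// A char always occupies exactly one byte ...
static_assert(sizeof(char) == 1);

// ... but the standard only guarantees that a byte has at least 8 bits.
static_assert(CHAR_BIT >= 8);

// std::string is an alias for std::basic_string<char>.
static_assert(std::is_same_v<std::string, std::basic_string<char>>);

// Hypothetical helper: the number of bits in a char on this implementation.
constexpr int bits_per_char() { return CHAR_BIT; }
```

On mainstream desktop and server platforms bits_per_char() is expected to return 8, although the standard does not require it.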
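Consequently, C++20 has a new typedef for the character type char8_t, std::u8string (an alias for std::basic_string<char8_t>), and a new UTF-8 string literal such as u8"Hello World", whose elements are of type char8_t.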