
String Length, Literals and Comparison

Explore how to handle string lengths, Unicode characters, and literals in D. Understand string concatenation and lexicographic comparison to manage text accurately and avoid common Unicode pitfalls.

Potentially confusing length of strings

We have seen that some Unicode characters are represented by more than one byte. For example, the character ‘é’ (the Latin letter ‘e’ combined with an acute accent) takes at least two bytes in UTF-8: two when it is stored as the single precomposed code point U+00E9, and more when it is stored as ‘e’ followed by a separate combining accent. This fact is reflected in the .length property of strings:

D
import std.stdio;

void main() {
    // .length counts UTF-8 code units, not letters
    writeln("résumé".length);
}

Although “résumé” contains six letters, the length of the string is the number of UTF-8 code units that it contains, which is 8: each ‘é’ takes two code units and each of the other four letters takes one.
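To make the difference between code units and letters concrete, here is a minimal sketch that prints both counts for the same word. It is an illustration rather than part of the original lesson: the variable name word is made up, the ‘é’ is written with the escape \u00E9 so that the result does not depend on how the source file encodes the character, and walkLength from std.range is used here as one way to count decoded code points.

```d
import std.stdio;
import std.range : walkLength;

void main() {
    // Assumption: 'é' is the precomposed code point U+00E9,
    // spelled as an escape so the file's own encoding cannot change the result.
    string word = "r\u00E9sum\u00E9";

    writeln(word.length);      // number of UTF-8 code units: 8
    writeln(word.walkLength);  // number of decoded code points: 6
}
```

Note that walkLength has to walk the whole string and decode it, so it costs time proportional to the string's size, whereas .length is simply read from the array. A count of code points also only matches the number of visible letters when each letter is a single code point, as it is in this word.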
Here “résumé” is a string literal whose elements are of type char, and each char value represents one UTF-8 code unit. The type of the elements of string literals like “hello” is char and ...