
Iterating Multibyte Strings

Explore techniques to iterate multibyte UTF-8 strings correctly in PHP using Laravel. Understand issues with standard byte iteration, memory implications of common methods, and how to implement custom UTF-8 string length and iteration functions for efficient string manipulation.

“UTF-8 string iterator implementation” contains the full implementation of the class we will develop throughout this chapter.

Issues with multibyte iteration

In the previous lesson, we implemented several string functions that worked by iterating over a string’s characters. Treating each byte as a single character is fine when working with ASCII, but as soon as we need to work with multibyte characters, things become more complicated. As an example, consider what happens if we attempt to iterate “這可以” one byte at a time. Each of these three characters occupies three bytes in UTF-8, so the loop visits nine byte fragments rather than three characters. After pressing the “Run” button, the output can be viewed by clicking the app link under the “Run” button.
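The byte-wise iteration described above can be sketched as follows. This is a minimal illustration rather than the lesson's own code: the `iterateBytes` helper is a hypothetical name introduced here, and the sketch assumes the source file is saved as UTF-8. It prints each byte as hexadecimal so the individual fragments stay readable instead of appearing as corrupted characters.

```php
<?php

// Hypothetical helper (not part of the lesson's code): collects the bytes
// that a naive index-based loop would treat as "characters".
function iterateBytes(string $value): array
{
    $bytes = [];
    $length = strlen($value); // strlen() counts bytes, not characters

    for ($i = 0; $i < $length; $i++) {
        $bytes[] = $value[$i]; // string offsets address single bytes
    }

    return $bytes;
}

// Assumes this file is UTF-8 encoded, so each character below is 3 bytes.
$string = '這可以';
$bytes = iterateBytes($string);

echo strlen($string), PHP_EOL; // prints 9, although there are 3 characters

foreach ($bytes as $i => $byte) {
    // Prints "0: e9", "1: 80", "2: 99", and so on for all nine bytes.
    echo $i, ': ', bin2hex($byte), PHP_EOL;
}
```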

String iterate example

Our output string would appear corrupted:

0: Ú1: Ç2: Ö3: Õ4:
...