A Simple PHP Tokenizer
Explore the process of tokenizing PHP code by combining PHP's built-in tokenizer with Laravel's collection features. Learn to implement cursors like VariableCursor, NumberCursor, and OpenTagCursor to iterate and parse PHP variables, numbers, and opening tags. This lesson guides you through creating a simple lexer to produce token arrays similar to PHP's token_get_all function, enhancing your string splitting skills in Laravel.
Tokenizing PHP code
Consider the following example. It calls PHP’s token_get_all function to run PHP’s tokenizer on an input string, then uses Laravel’s collection features to give each returned token a friendlier name:
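A minimal sketch of such an example is shown here. The input string and the helper name namedTokens are assumptions made for illustration, and plain array_map stands in for Laravel’s collection so the snippet runs without the framework:

```php
<?php

// Assumed sample input: an opening tag, then one assignment on line 2.
$code = "<?php\n\$value = 1 + 2;";

/**
 * Hypothetical helper: tokenize $code and swap each numeric token
 * identifier for its readable name (for example, T_VARIABLE).
 */
function namedTokens(string $code): array
{
    return array_map(
        function ($token) {
            // Single-character tokens such as "=" and "+" are plain strings.
            if (is_string($token)) {
                return $token;
            }

            // Array tokens have the form [identifier, text, line number].
            [$id, $text, $line] = $token;

            return [token_name($id), $text, $line];
        },
        token_get_all($code)
    );
}

$tokens = namedTokens($code);
print_r($tokens);
```

Inside a Laravel application, the same mapping could be written as collect(token_get_all($code))->map(...)->all(), passing the same callback to map.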
Our example would produce the following output:
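As an illustration, assume the input is an opening tag followed by $value = 1 + 2; on line 2, and that each token identifier has been replaced by its name via token_name. Under those assumptions (the variable name $expected is hypothetical), the result would have roughly this shape, written here as a PHP array:

```php
<?php

// Assumed input: an opening tag, then "$value = 1 + 2;" on line 2.
// Each nested array is [token name, matched text, line number].
$expected = [
    ['T_OPEN_TAG', "<?php\n", 1],
    ['T_VARIABLE', '$value', 2],
    ['T_WHITESPACE', ' ', 2],
    '=',
    ['T_WHITESPACE', ' ', 2],
    ['T_LNUMBER', '1', 2],
    ['T_WHITESPACE', ' ', 2],
    '+',
    ['T_WHITESPACE', ' ', 2],
    ['T_LNUMBER', '2', 2],
    ';',
];
```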
Each element in the resulting array corresponds to a piece of the input text. For instance, the second element corresponds to the $value variable on line 2. In each nested array, the first value is the token identifier, the second is the matched text, and the third is the line number on which the token is located. We can also see that values such as the + and = ...