Strings, Regex & Unicode
String Methods
JavaScript recently gained a few string methods that make working with strings more convenient: .startsWith, .endsWith, .includes, and .repeat.
Previously, if we wanted to search for some text in a string, we had to use .indexOf or a regular expression. These methods make that a bit easier.
.startsWith
If we wanted to see whether a string started with a specific bit of text, we could have created a regular expression that looked like this.
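A minimal sketch of that regular expression approach. The sample string and variable names are assumptions chosen for illustration.

```javascript
// Sample string, assumed for illustration only.
const str = 'JavaScript is fun to learn';

// ^ anchors the pattern to the start of the string.
const startsWithJs = /^JavaScript/.test(str);

console.log(startsWithJs); // true
```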
This regular expression searches for the word JavaScript at the beginning of our string. With the .startsWith method, we can do the same thing.
The method returns true if the text you are looking for appears at the start of the string, and false otherwise.
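The same check with .startsWith might look like the sketch below. The sample string is an assumption for illustration.

```javascript
// Sample string, assumed for illustration only.
const title = 'JavaScript is fun to learn';

const atStart = title.startsWith('JavaScript');
const wrongCase = title.startsWith('javascript'); // the comparison is case sensitive
const notAtStart = title.startsWith('fun');

console.log(atStart);    // true
console.log(wrongCase);  // false
console.log(notAtStart); // false
```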
.endsWith
Similar to .startsWith, the .endsWith method checks whether a string ends with a specific bit of text. If we wanted to do this with a regular expression, we could match it like this.
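A minimal sketch of the regular expression version, using an assumed sample string.

```javascript
// Sample string, assumed for illustration only.
const phrase = 'JavaScript is fun to learn';

// $ anchors the pattern to the end of the string.
const endsWithLearn = /learn$/.test(phrase);

console.log(endsWithLearn); // true
```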
The .endsWith method allows us to make this check without writing any regex, and it simply returns true or false.
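For example, a short sketch with an assumed sample string:

```javascript
// Sample string, assumed for illustration only.
const lesson = 'JavaScript is fun to learn';

const atEnd = lesson.endsWith('learn');
const notAtEnd = lesson.endsWith('fun');

console.log(atEnd);    // true
console.log(notAtEnd); // false
```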
.includes
If we can check whether a string starts or ends with some text, we should also be able to check whether it includes some text. The .includes method allows us to do just this. It is another convenience method for something we could otherwise do with a regular expression.
The .includes method checks the entire string for the provided text. Note that the check is case sensitive!
Just like .startsWith and .endsWith, the .includes method returns true or false.
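A short sketch of .includes next to the equivalent regular expression test. The sample string is an assumption for illustration.

```javascript
// Sample string, assumed for illustration only.
const sentence = 'JavaScript is fun to learn';

const viaRegex = /fun/.test(sentence);        // the older regex approach
const viaIncludes = sentence.includes('fun'); // found anywhere in the string
const wrongCase = sentence.includes('FUN');   // the check is case sensitive

console.log(viaRegex);    // true
console.log(viaIncludes); // true
console.log(wrongCase);   // false
```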
Start position
Like the other methods (.startsWith accepts a start position and .endsWith an end position), .includes takes an optional second position parameter. It tells .includes where in the string to start checking for the value.
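A small sketch of the position parameter. The sample string is an assumption; in it, the word fun begins at index 14.

```javascript
// Sample string, assumed for illustration only.
const source = 'JavaScript is fun to learn';

const fromStart = source.includes('fun');   // searches the whole string
const fromWord = source.includes('fun', 14); // search begins exactly where 'fun' starts
const tooLate = source.includes('fun', 15);  // search begins one character past the 'f'

console.log(fromStart); // true
console.log(fromWord);  // true
console.log(tooLate);   // false
```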
We will see this again when we get to the array .includes method in the ES7(ES2016) & Beyond chapter.
.repeat
One more method to look at is .repeat. It is pretty straightforward: it allows us to repeat a string a given number of times.
There are a few restrictions on what you can pass in as the count. It cannot be negative or Infinity (either throws a RangeError), and if it is a decimal number, the fractional part is dropped, so 2.9 becomes 2.
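The sketch below illustrates these cases with an assumed two-character string. Note that a fractional count has its fractional part dropped rather than being rounded to the nearest integer.

```javascript
// Sample string, assumed for illustration only.
const word = 'ab';

const three = word.repeat(3);       // three copies
const none = word.repeat(0);        // zero copies gives an empty string
const truncated = word.repeat(2.9); // fractional part is dropped, so two copies

// A negative count throws a RangeError.
let negativeError = null;
try {
  word.repeat(-1);
} catch (err) {
  negativeError = err;
}

console.log(three);                               // 'ababab'
console.log(none === '');                         // true
console.log(truncated);                           // 'abab'
console.log(negativeError instanceof RangeError); // true
```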
Unicode
New in ES6 is the ability to write Unicode characters directly as code points. Previously this was not possible, because a \u escape could only hold exactly 4 hexadecimal digits. Any Unicode character whose code point required more than 4 digits had to be written as what is called a surrogate pair.
However, in ES6 we can use the \u{} syntax with up to 6 hexadecimal digits, enough to represent every Unicode character. It is pretty straightforward.
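As a sketch, here is the rocket emoji (code point U+1F680) written both ways. The variable names are assumptions for illustration.

```javascript
// Pre-ES6: two 4-digit escapes forming a surrogate pair.
const surrogatePair = '\uD83D\uDE80';

// ES6: a single code point escape.
const codePoint = '\u{1F680}';

console.log(codePoint === surrogatePair);           // true, both are the same string
console.log(codePoint.length);                      // 2, still stored as two UTF-16 code units
console.log(codePoint.codePointAt(0) === 0x1F680);  // true
```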
Regex
Regex in ES6 also got a few additions. There are two new flags available: y, the sticky flag, and u, the Unicode flag. For example, say we have a really cool bit of text.
And we wanted to see if the rocket ship emoji was in there. We could do something like this.
Note that you could also write .match(/🚀/), which does not need the u flag. The u flag is required when we pass a Unicode code point escape, \u{}, to our expression. We can also match a range, so assume we wanted to get all the face emojis used in a bit of text.
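A minimal sketch, assuming a sample string that ends with the rocket emoji (U+1F680):

```javascript
// Sample text, assumed for illustration only.
const text = 'ES6 is out of this world 🚀';

// With the u flag, \u{1F680} is read as a single code point.
const withFlag = text.match(/\u{1F680}/u);

// Without the u flag, \u{...} is not read as a code point escape, so nothing matches here.
const withoutFlag = text.match(/\u{1F680}/);

console.log(withFlag[0]);  // '🚀'
console.log(withoutFlag);  // null
```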
Using the u flag we could match on a range of characters.
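A sketch of such a range, using an assumed sample message. The range runs from U+1F600 (😀) to U+1F637 (😷), so the rocket at U+1F680 falls outside it.

```javascript
// Sample message, assumed for illustration only.
const message = 'Hello 😀, feeling sick 😷, launching 🚀';

// g finds every match; u makes each \u{...} escape a single code point.
const faces = message.match(/[\u{1F600}-\u{1F637}]/gu);

console.log(faces); // ['😀', '😷']
```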
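This will match globally (the g flag) all the faces from 😀 to 😷. We could also write the emoji directly, .match(/[😀-😷]/gu), but the u flag is still required: without it each emoji is treated as two separate surrogate halves and the range is rejected as invalid.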
Sticky match
The last thing to talk about with regex is the y flag, the sticky flag. It lets us control exactly where a match must start. Before we dive into it, we need to look at the lastIndex property.
When you use the g flag and run exec, it finds the first match and sets the lastIndex property to the position just after that match. The next time you run exec, it starts searching from that index. With the y (sticky) flag, we can set lastIndex ourselves to tell the regex where to look, and the match must begin exactly at that index.
This can be helpful if you need to check for a bit of text starting from a specific point in your string.
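The sketch below contrasts the two flags on an assumed sample string in which JavaScript appears at index 0 and again at index 17.

```javascript
// Sample string, assumed for illustration only.
const input = 'JavaScript loves JavaScript';

// Global flag: each exec call searches forward from lastIndex.
const globalRe = /JavaScript/g;
globalRe.exec(input);
const afterFirst = globalRe.lastIndex;   // position just after the first match
const second = globalRe.exec(input);     // found later in the string

// Sticky flag: the match must begin exactly at lastIndex.
const stickyRe = /JavaScript/y;
stickyRe.lastIndex = 17;
const atSeventeen = stickyRe.test(input); // 'JavaScript' starts at index 17

stickyRe.lastIndex = 11;
const atEleven = stickyRe.test(input);    // index 11 is the start of 'loves'

console.log(afterFirst);   // 10
console.log(second.index); // 17
console.log(atSeventeen);  // true
console.log(atEleven);     // false
```

A failed sticky match resets lastIndex to 0, so set it again before each check.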