Types of Regular Expressions

In this lesson, we will classify regular expressions into two main types: POSIX and Perl regular expressions.

RegExps can be classified in many different ways, but if you look at how they originated (and that is what we’re doing in this chapter), there are two main groups:

  1. POSIX

  2. Perl Regular Expressions

POSIX

POSIX stands for Portable Operating System Interface. It is a set of standards for operating systems and for the tools that need to be compatible with them. The standards include a set of APIs to interact with and a set of command-line shells and utilities that must be present. Software that follows them should be portable across Unix systems and other operating systems that follow POSIX.

Although its first version (POSIX.1) was released around 1988, regular expressions were added in 1992 as part of POSIX.2. As of the writing of this course, this version of regular expressions is the oldest flavor in use today. The main characteristic of this flavor, at least in its basic form (POSIX Basic Regular Expressions), is that several special characters (characters that carry extra meaning), such as the grouping parentheses and the repetition braces, need to be escaped (i.e., prefixed with a \ character) to be recognized as such.
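To make the escaping rule concrete: in POSIX Basic Regular Expressions, "exactly two a characters" is written a\{2\}, with the braces escaped. The short sketch below feeds that same pattern text to JavaScript, the language this course works in, where the rule is reversed. It is only an illustration of the contrast, not a POSIX implementation, and the variable names are made up for this example.

```javascript
// The pattern text a\{2\} means "two a's" under POSIX basic rules.
// JavaScript reads the backslashes the other way around:
// "treat these braces as literal characters".
const escapedBraces = /a\{2\}/;

// In JavaScript, the unescaped braces are the repetition operator.
const plainBraces = /a{2}/;

console.log(escapedBraces.test("aa"));   // false
console.log(escapedBraces.test("a{2}")); // true
console.log(plainBraces.test("aa"));     // true
console.log(plainBraces.test("a{2}"));   // false
```

The same characters therefore mean opposite things depending on the flavor, which is why it helps to know which family a tool belongs to before copying a pattern into it.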

PCRE (Perl Compatible Regular Expressions)

Since the goal of the Perl programming language was to be a flexible text processing tool, regular expressions were a natural fit for it. So much so that RegExps are supported natively by the language itself. Many other languages require you to use a library that implements regular expressions, but Perl does not. You can just use two / characters, one at the start and one at the end, to declare a RegExp.

JavaScript implements a subset of this flavor of regular expressions, so this is the one you’ll want to pay attention to. The main difference from the POSIX version is that special characters are directly supported without needing to be escaped. Instead, you have to escape those characters when you want to use them for their normal value. For example, . is a special character that matches any single character, while \. matches only the dot character itself.
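A minimal sketch of that difference in JavaScript, which borrows the two-slash literal syntax from Perl. The variable names and test strings are chosen just for this example.

```javascript
// A regex literal: the pattern sits between two / characters.
const anyChar = /a.c/;     // unescaped . is special: it stands for any single character
const literalDot = /a\.c/; // escaped \. is the plain dot character

console.log(anyChar.test("abc"));    // true
console.log(anyChar.test("a.c"));    // true
console.log(literalDot.test("abc")); // false
console.log(literalDot.test("a.c")); // true
```

Note that the unescaped pattern also accepts "a.c", because a dot is itself a single character. Forgetting the backslash is therefore easy to miss in testing: the pattern still matches the input you had in mind, it just matches more than you intended.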

Now that we’ve covered the basic introduction to the course, let’s dive headfirst into understanding what makes up a regular expression and what all those strange characters mean.
