To match whitespace characters without also matching line breaks in a Perl regular expression, we can use a negated character class that excludes every non-whitespace character (\S), the carriage return (\r), and the newline (\n). What remains is every whitespace character except \r and \n.
We'll use the following regular expression to solve the problem:
/[^\S\r\n]/
Let's break it down to see how it works:
/ /: The slashes delimit the regular expression pattern.
/[ ]/: The square brackets define a character class, which matches any single character listed inside the brackets.
/[^ ]/: A caret at the beginning of a character class negates it, so the class matches any single character that is not listed inside the brackets.
/[^\S\r\n]/: \S matches any non-whitespace character, \r matches a carriage return, and \n matches a newline. The negated class therefore matches any character that is not a non-whitespace character, not a carriage return, and not a newline. In other words, it matches any whitespace character other than \r and \n.
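As a rough illustration of the breakdown above, the class can be used in a substitution that tidies spacing without disturbing line breaks. This is only a sketch: the helper name squeeze_inline_whitespace and the sample text are made up for the example and are not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse each run of whitespace other than \r and \n
# into a single space, leaving all line breaks untouched.
sub squeeze_inline_whitespace {
    my ($text) = @_;
    (my $copy = $text) =~ s/[^\S\r\n]+/ /g;   # work on a copy of the input
    return $copy;
}

# Sample input (an assumption for illustration): mixed spaces and tabs,
# one Unix line ending and one Windows line ending.
my $input = "alpha \t beta\ngamma\t\tdelta\r\n";
print squeeze_inline_whitespace($input);
```

With this input, the runs of spaces and tabs between words become single spaces, while the \n and \r\n line endings pass through unchanged.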
The following Perl script identifies whether certain whitespace characters match a regular expression:
#! /usr/bin/env perl

use strict;
use warnings;

use 5.005; # for qr//

my $whitespace_except_newline = qr/[^\S\r\n]/;

for (' ', '\t', '\f', '\n', '\r') {
  my $pp = qq["$_"];
  printf "%-4s --> %s\n", $pp, (eval $pp) =~ $whitespace_except_newline ? "matched" : "not matched";
}
Line 1: This is the shebang line. It tells the operating system which interpreter to use when the script is run directly; here, env locates perl on the PATH.
Lines 3–4: These pragmas enable strict checking and warnings for the script.
Line 6: This line requires Perl 5.005 or later, the release that introduced the qr operator.
Line 8: This uses qr to compile a regular expression that matches any whitespace character except carriage return (\r) and newline (\n). The compiled expression is stored in the variable $whitespace_except_newline.
Lines 10–13: This for loop iterates over a list of whitespace characters (space, tab, form feed, newline, and carriage return). They are written as single-quoted string literals, so an entry such as '\t' holds the two characters \ and t rather than an actual tab. For each entry, the loop wraps the text in double quotes to form $pp, which printf prints as a readable label. The eval function then evaluates $pp as Perl code, which turns the escape sequence into the actual character, and the result is tested against $whitespace_except_newline to print "matched" or "not matched".
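The eval step in the script exists mainly so that each escape sequence can be printed in readable form. A possible variant, sketched here under the assumption that a separate label per character is acceptable, avoids eval by pairing each label with a double-quoted literal. The @cases table and the classify helper are hypothetical names introduced for this example.

```perl
use strict;
use warnings;

my $whitespace_except_newline = qr/[^\S\r\n]/;

# Hypothetical table: printable label paired with the actual character.
my @cases = (
    [ 'space',           " "  ],
    [ 'tab',             "\t" ],
    [ 'form feed',       "\f" ],
    [ 'newline',         "\n" ],
    [ 'carriage return', "\r" ],
);

# Hypothetical helper: report whether a single character matches the class.
sub classify {
    my ($char) = @_;
    return $char =~ $whitespace_except_newline ? "matched" : "not matched";
}

for my $case (@cases) {
    my ($label, $char) = @$case;
    printf "%-15s --> %s\n", $label, classify($char);
}
```

Run as written, this reports space, tab, and form feed as matched, and newline and carriage return as not matched, which agrees with the behavior of the original script.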