How to match whitespace but not newlines in Perl

To match whitespace but not line breaks in a Perl regular expression, we can use a negated character class that excludes every non-whitespace character (\S), the carriage return (\r), and the newline (\n). The only characters left for the class to match are whitespace characters other than those two line-break characters.

RegEx

We'll use the following regular expression statement to solve the problem:

/[^\S\r\n]/

Let's break it down to see how it works:

  • / /

    • / /: The slashes are the delimiters of the match operator. The pattern itself goes between them.

  • /[ ]/

    • [ ]: The square brackets define a character class, which matches any single character listed inside them.

  • /[^ ]/

    • ^ : The caret symbol at the beginning of a character class negates the match, so this regex pattern will match any character that is not within the brackets.

  • /[^\S\r\n]/

    • \S: This matches any non-whitespace character, \r matches a carriage return, and \n matches a newline. Because the class is negated, it matches any character that is none of these: not a non-whitespace character, not a carriage return, and not a newline. In other words, it matches any whitespace character except \r and \n.
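
One practical use of this class is tidying the spacing inside each line without joining lines together. The following is a minimal sketch and not part of the original script: the helper name squeeze_spaces and the sample text are made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: collapse each run of non-newline whitespace
# into a single space, leaving \r and \n untouched.
sub squeeze_spaces {
    my ($text) = @_;
    $text =~ s/[^\S\r\n]+/ /g;
    return $text;
}

# Made-up sample input: two lines with irregular spacing.
my $sample = "alpha  \t beta\ngamma\t\tdelta\n";
print squeeze_spaces($sample);
```

Because the class excludes \r and \n, the substitution never touches the line breaks, so the two lines of the sample stay separate.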

Code example

The following Perl script identifies whether certain whitespace characters match a regular expression:

#! /usr/bin/env perl
use strict;
use warnings;
use 5.005; # for qr
my $whitespace_except_newline = qr/[^\S\r\n]/;
for (' ', '\t', '\f', '\n', '\r') {
    my $pp = qq["$_"]; # wrap the escape in double quotes, e.g. "\t"
    printf "%-4s --> %s\n", $pp,
        (eval $pp) =~ $whitespace_except_newline ? "matched" : "not matched";
}

Code explanation

  • Line 1: This is known as the shebang line, and it specifies the path for the Perl interpreter to use when running the script.

  • Lines 2–3: These are pragmas that enable strict checking and warnings when running the Perl script.

  • Line 4: This line declares that the script requires Perl 5.005 or later, the release that introduced the qr operator used on the next line.

  • Line 5: This compiles a regular expression that matches any whitespace character except for carriage return (\r) and newline (\n). The compiled expression is stored in a variable called $whitespace_except_newline.

  • Lines 6–10: This for loop iterates over a list of whitespace characters (space, tab, form feed, newline, and carriage return). They are written as single-quoted string literals, so an entry such as '\t' is still the two characters backslash and t at this point. For each entry, the loop wraps the text in double quotes to build a string such as "\t", and the eval function evaluates that string as Perl code to produce the actual character. The printf function then prints the quoted form alongside whether $whitespace_except_newline matches the character. The script reports "matched" for the space, tab, and form feed, and "not matched" for the newline and carriage return.
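
The same check can be written without string eval by pairing each printable label with the real character. This is only a sketch of one alternative, not the original author's approach: the @cases list and the verdict helper are names introduced here for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $whitespace_except_newline = qr/[^\S\r\n]/;

# Each pair holds a printable label (single-quoted, so the backslash
# stays literal) and the actual character (double-quoted).
my @cases = (
    [ '" "',  " "  ],
    [ '"\t"', "\t" ],
    [ '"\f"', "\f" ],
    [ '"\n"', "\n" ],
    [ '"\r"', "\r" ],
);

# Hypothetical helper: returns the same wording the script prints.
sub verdict {
    my ($char) = @_;
    return $char =~ $whitespace_except_newline ? "matched" : "not matched";
}

for my $case (@cases) {
    my ($label, $char) = @$case;
    printf "%-4s --> %s\n", $label, verdict($char);
}
```

Keeping the label and the character as separate values avoids evaluating strings as code, at the cost of writing each test case twice.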
