Trusted answers to developer questions

Related Tags

regex
golang
communitycreator

Greedy, reluctant, and possessive quantifiers for RegEx in Go

Omoyemi John Arigbanla

Introduction

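A quantifier in RegEx is a meta-character that specifies how many times the element immediately before it (a character, a character class, or a group) may repeat. The basic quantifiers are +, *, and ?. A quantifier has this special meaning only when it follows something it can repeat. When it is escaped with a backslash or placed inside a character class, it is treated as itself and simply matches that literal character.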

Greedy, reluctant, and possessive quantifiers

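The differences between greedy, reluctant, and possessive quantifiers in regular expressions are summarized below:

  • Greedy (a+, a*, a?): It matches as much text as possible while still letting the rest of the pattern match. This is the default behavior: a+ matches one or more consecutive a characters, a* matches zero or more, and a? matches one a or nothing.
  • Reluctant (a+?, a*?, a??): It matches as little text as possible while still letting the rest of the pattern match. It is written by appending ? to a greedy quantifier.
  • Possessive (a++, a*+, a?+): It matches as much text as possible and never gives any of it back, even if that makes the rest of the pattern fail. It is written by appending + to a greedy quantifier in engines that support it. Go's regexp package does not support this form.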
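To make the greedy and reluctant behavior concrete, the following minimal sketch runs both forms of + and * against the same input. The helper name firstMatch is hypothetical and exists only for this illustration; it uses the standard regexp.MustCompile and FindString calls.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch returns the leftmost match of pattern in input.
// It is a hypothetical helper written for this illustration.
func firstMatch(pattern, input string) string {
	return regexp.MustCompile(pattern).FindString(input)
}

func main() {
	input := "aaa"

	fmt.Printf("%q\n", firstMatch(`a+`, input))  // greedy: "aaa"
	fmt.Printf("%q\n", firstMatch(`a+?`, input)) // reluctant: "a"
	fmt.Printf("%q\n", firstMatch(`a*`, input))  // greedy: "aaa"
	fmt.Printf("%q\n", firstMatch(`a*?`, input)) // reluctant: "" (zero characters)
}
```

The greedy forms take the whole run of a characters, while the reluctant forms stop at the smallest match their quantifier allows.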

Let’s discuss all these approaches with coding examples in Go.

Greedy – +

A greedy quantifier matches as much text as possible. The + quantifier applies only to the element immediately before it (a single character, a character class, or a group), not to a whole word, and it consumes as many consecutive repetitions of that element as it can before stopping. Greedy matching is the default in Go, so we use it whenever we want each match to cover the longest possible run.

Example

l+ matches one or more consecutive l characters.

package main

import (
    "regexp"
    "fmt"
)

func main() {
    var re = regexp.MustCompile(`l+`)
    var str = `hello lullaby`
    
    for i, match := range re.FindAllString(str, -1) {
        fmt.Println(match, "found at index", i)
    }
}

Explanation

  • Lines 4–5: We import the regexp and fmt packages.
  • Line 9: We compile the pattern with regexp.MustCompile, which returns a *regexp.Regexp that can be matched against text and panics if the pattern is invalid.
  • Line 10: This is the string that the expression from line 9 is matched against.
  • Line 12: This for loop iterates over the slice of all successive matches returned by FindAllString and prints each match along with its index in that slice (not its offset in the string).

The word hello is matched as "he ll o", which counts as one match because a run of consecutive l characters is consumed as a single match. The word lullaby is matched as "l u ll aby", which gives two matches. The snippet therefore prints three matches in total: ll, l, and ll.
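The index printed by the loop above is the position of each match in the returned slice. If the position inside the string is needed instead, a sketch along the following lines could use FindAllStringIndex, which returns the start and end byte offsets of each match. The helper matchOffsets and its "text@start" output format are assumptions made for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchOffsets reports every match of pattern in input as "text@start",
// where start is the byte offset of the match within input.
// The name and output format are hypothetical, chosen for this sketch.
func matchOffsets(pattern, input string) []string {
	re := regexp.MustCompile(pattern)
	var out []string
	for _, loc := range re.FindAllStringIndex(input, -1) {
		// loc[0] is the start offset; loc[1] is one past the end.
		out = append(out, fmt.Sprintf("%s@%d", input[loc[0]:loc[1]], loc[0]))
	}
	return out
}

func main() {
	for _, m := range matchOffsets(`l+`, "hello lullaby") {
		fmt.Println(m)
	}
	// Prints:
	// ll@2
	// l@6
	// ll@8
}
```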

Reluctant – ?

A reluctant (also called lazy or non-greedy) quantifier is written by appending a question mark to another quantifier: +?, *?, or ??. It tries to match as little text as possible and consumes more only when the rest of the pattern cannot match otherwise. The term reluctant doesn't refer to unwillingness to match. Rather, it describes how the quantifier prefers the shortest match that still lets the whole pattern succeed. On its own, ? is the greedy "zero or one" quantifier, and that is the form used in the example below.

Example

a? matches an a character or nothing.

package main

import (
	"fmt"
	"regexp"
)

func main() {
	var re = regexp.MustCompile(`ba?`)
	var str = `ba b a`

	for i, match := range re.FindAllString(str, -1) {
		fmt.Println(match, "found at index", i)
	}
}

Explanation

  • Lines 4–5: We import the fmt and regexp packages.
  • Line 9: We compile the pattern with regexp.MustCompile, which returns a *regexp.Regexp that can be matched against text and panics if the pattern is invalid.
  • Line 10: This is the string that the expression from line 9 is matched against.
  • Line 12: This for loop iterates over the slice of all successive matches returned by FindAllString and prints each match along with its index in that slice (not its offset in the string).

The code snippet above returns two matches: ba and b. The trailing a on its own is not matched because the pattern requires a leading b.
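For comparison, the sketch below places the greedy and reluctant forms side by side: ba? against ba??, and <.+> against <.+?>. The helper allMatches and the sample HTML-like string are assumptions introduced for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// allMatches returns every non-overlapping match of pattern in input.
// It is a hypothetical helper written for this illustration.
func allMatches(pattern, input string) []string {
	return regexp.MustCompile(pattern).FindAllString(input, -1)
}

func main() {
	// Greedy ? takes the optional a when it is present.
	fmt.Printf("%q\n", allMatches(`ba?`, "ba b a")) // ["ba" "b"]
	// Reluctant ?? skips the optional a whenever it can.
	fmt.Printf("%q\n", allMatches(`ba??`, "ba b a")) // ["b" "b"]

	// Greedy .+ runs to the last > in the input.
	fmt.Printf("%q\n", allMatches(`<.+>`, "<b>bold</b>")) // ["<b>bold</b>"]
	// Reluctant .+? stops at the first > it reaches.
	fmt.Printf("%q\n", allMatches(`<.+?>`, "<b>bold</b>")) // ["<b>" "</b>"]
}
```

The reluctant form is usually the one to reach for when a pattern should stop at the nearest closing delimiter rather than the farthest one.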

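Possessive – *+

A possessive quantifier is written by appending + to another quantifier (*+, ++, or ?+) in engines that support it, such as Java and PCRE. It matches as much text as possible and never gives any of it back, even if the rest of the pattern then fails to match. Go's regexp package uses RE2 syntax, which does not support possessive quantifiers, so a pattern such as a*+ fails to compile. The example below therefore uses the greedy * quantifier, which also consumes the longest possible run.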

Example

a* matches zero or more consecutive a character(s).

package main

import (
    "regexp"
    "fmt"
)

func main() {
    var re = regexp.MustCompile(`ba*`)
    var str = `a ba baa aaa ba b`
    
    for i, match := range re.FindAllString(str, -1) {
        fmt.Println(match, "found at index", i)
    }
}

Explanation

  • Lines 4–5: We import the regexp and fmt packages.
  • Line 9: We compile the pattern with regexp.MustCompile, which returns a *regexp.Regexp that can be matched against text and panics if the pattern is invalid.
  • Line 10: This is the string that the expression from line 9 is matched against.
  • Line 12: This for loop iterates over the slice of all successive matches returned by FindAllString and prints each match along with its index in that slice (not its offset in the string).

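The code snippet above returns four matches: ba, baa, ba, and b. The leading a and the run aaa are not matched because the pattern requires a leading b.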
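Because possessive syntax is not part of the RE2 grammar that Go's regexp package accepts, a quick check like the following sketch shows which quantifier forms compile. The helper name compiles is hypothetical; regexp.Compile is the standard call that returns an error instead of panicking.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether Go's regexp package accepts the pattern.
// It is a hypothetical helper written for this illustration.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// Greedy and reluctant forms are accepted; possessive forms are rejected.
	for _, p := range []string{`a*`, `a*?`, `a*+`, `a++`, `a?+`} {
		fmt.Printf("%-4s compiles: %v\n", p, compiles(p))
	}
}
```

Using regexp.Compile rather than regexp.MustCompile here keeps the program from panicking on the rejected patterns.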

Summary

To use regular expressions (RegEx) well, we must first understand how their quantifiers work. A single RegEx may return several matches, and the type of quantifier determines how much text each match consumes. Some quantifiers are better suited than others for certain situations, so the quantifier we choose directly affects our search results.
