Greedy vs Lazy Quantifiers
What are the differences between greedy and lazy quantifiers in regex?
Greedy and lazy quantifiers control how much of the input string a repeated part of a regex pattern consumes.
Greedy Quantifiers
Greedy quantifiers match as much of the input string as possible. By default, the quantifiers *, +, and {n,m} are greedy. This means each one tries to consume the longest possible substring that still lets the rest of the pattern match.
For example:
#include <iostream>
#include <regex>
#include <string>

int main() {
  std::string input = "abc123def456";
  // Greedy: .* takes as much as it can while \d can still match
  std::regex pattern(".*\\d");
  std::smatch match;
  if (std::regex_search(input, match, pattern)) {
    std::cout << match.str();
  }
}
abc123def456
In this example, .*\d matches the entire string because .* is greedy: it first consumes every character, then backtracks just far enough for \d to match the final digit, 6.
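To see where the greedy .* actually stops, one option is to wrap each part of the pattern in a capture group. The following is a minimal sketch; the helper name split_greedy is hypothetical and only exists for illustration.

```cpp
#include <regex>
#include <string>
#include <utility>

// Hypothetical helper (not part of the original example): returns the text
// captured by (.*) and by (\d) for the first match in `input`.
// Both strings are empty when the pattern does not match.
std::pair<std::string, std::string> split_greedy(const std::string& input) {
  static const std::regex pattern("(.*)(\\d)");
  std::smatch match;
  if (!std::regex_search(input, match, pattern)) {
    return {"", ""};
  }
  // match[1] is what the greedy .* kept after backtracking,
  // match[2] is the single digit left over for \d
  return {match[1].str(), match[2].str()};
}
```

For "abc123def456", the first group holds abc123def45 and the second holds 6. For an input that does not end in a digit, such as "abc123def", the greedy group gives back more characters and ends up holding abc12, with 3 in the second group.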
Lazy Quantifiers
Lazy quantifiers match as little of the input string as possible. To make a quantifier lazy, you append a ? to it. The lazy versions are *?, +?, and {n,m}?.
For example:
#include <iostream>
#include <regex>
#include <string>

int main() {
  std::string input = "abc123def456";
  // Lazy: .*? takes as little as it can before \d matches
  std::regex pattern(".*?\\d");
  std::smatch match;
  if (std::regex_search(input, match, pattern)) {
    std::cout << match.str();
  }
}
abc1
Here, .*?\d matches only abc1 because .*? is lazy: it expands one character at a time and stops as soon as the \d that follows it can match a digit.
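The same contrast applies to +? and {n,m}?. The sketch below uses a small hypothetical helper, first_match, to run different patterns against the same input so the results can be compared side by side.

```cpp
#include <regex>
#include <string>

// Hypothetical helper for illustration: returns the text of the first match
// of `pattern_text` in `input`, or an empty string if there is no match.
std::string first_match(const std::string& input,
                        const std::string& pattern_text) {
  const std::regex pattern(pattern_text);
  std::smatch match;
  if (std::regex_search(input, match, pattern)) {
    return match.str();
  }
  return std::string();
}
```

With the input "abc123def456", \d+ yields 123 while \d+? yields 1, and \d{1,3} yields 123 while \d{1,3}? yields 1. A lazy quantifier still expands when the rest of the pattern requires it: \d+?[a-z] yields 123d, because the letter can only match after all three digits have been consumed.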
When to Use Each
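Use greedy quantifiers when you want the match to extend to the last possible occurrence of whatever follows the quantifier, and lazy quantifiers when you want it to stop at the first. A lazy quantifier still expands as far as it must for the overall pattern to match, so the result is the shortest match from the leftmost starting position, not necessarily the shortest match anywhere in the string.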
Understanding the difference helps in fine-tuning regex patterns for specific use cases, such as extracting or replacing substrings within larger text.
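As one illustration of the replacing case, consider removing angle-bracket tags from a string. This is a minimal sketch, not a robust HTML parser; the function name strip_tags and its lazy flag are assumptions made for the example.

```cpp
#include <regex>
#include <string>

// Hypothetical helper for illustration: removes every match of a simple
// "tag" pattern from `text`. With lazy == true the pattern is <.*?>,
// otherwise it is the greedy <.*>.
std::string strip_tags(const std::string& text, bool lazy) {
  const std::regex tag(lazy ? "<.*?>" : "<.*>");
  // Replace each match with an empty string
  return std::regex_replace(text, tag, "");
}
```

For the input "<b>bold</b> and <i>italic</i>", the lazy version removes each tag separately and leaves bold and italic. The greedy version matches from the first < to the last >, which here is the whole string, so nothing is left.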