Regex Catastrophic Backtracking: How to Fix Regex That Freezes Your App
You ship a regex pattern that looks perfectly fine.
It passes tests. It works in Regex101. The code review approves it.
Then in production, a user submits input that makes your Node.js server process spin at 100% CPU for 30 seconds.
The regex did not throw an error. It just never finished.
This is catastrophic backtracking.
It is one of the most dangerous regex issues because:
- the pattern looks correct
- it works on normal input
- it only fails on specific malicious input
- it can bring down production systems
This guide explains what catastrophic backtracking is, how to identify patterns at risk, and how to fix them in JavaScript and Python.
If you want to test potentially dangerous patterns safely, the Regex Tester helps visualize backtracking behavior.
What Is Catastrophic Backtracking?
Catastrophic backtracking happens when a regex pattern has multiple ways to match the same text and the engine, failing to find an overall match, tries every one of them. The number of attempts can grow exponentially with input length.
The classic example:
(a+)+b
This pattern looks for:
- one or more a characters, repeated one or more times
- followed by b
On input like "aaaaaaaaaaaaaaaaaaaaaaaaaaaaa" (no b at the end), the engine tries every possible combination of a+ groupings before giving up.
For a string of 30 a characters, that is hundreds of millions of combinations.
Why It Happens
Regex engines use backtracking to explore alternatives.
When the engine cannot find a match after consuming characters greedily, it "backtracks": it steps back and tries a different combination.
With nested repetition, the number of backtracking steps grows exponentially with input length.
Input: "aaaa"
Pattern: (a+)+b
The engine tries:
- (aaaa), (aaa)(a), (aa)(aa), (aa)(a)(a), (a)(aaa), (a)(aa)(a), (a)(a)(aa), (a)(a)(a)(a)
It works through all of these before concluding that no match exists. That is 8 combinations (2^3) for just 4 characters. At 30 characters, it is 2^29, more than 500 million.
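The count above can be reproduced with a short Python sketch. It does not instrument the regex engine. It only enumerates the ways a run of characters can be split into consecutive chunks, which is the search space the nested quantifier creates. The function names are illustrative.

```python
def groupings(text):
    """Yield every way to split text into consecutive non-empty chunks."""
    if not text:
        yield []
        return
    for cut in range(1, len(text) + 1):
        head = text[:cut]
        for rest in groupings(text[cut:]):
            yield [head] + rest


def count_groupings(n):
    """Closed form: a run of n characters has 2**(n - 1) groupings."""
    return 2 ** (n - 1) if n > 0 else 1


print(len(list(groupings("aaaa"))))  # 8
print(count_groupings(30))           # 536870912
```

Each extra character doubles the count, which is why a pattern that feels instant at 10 characters is unusable at 30.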
The Most Dangerous Patterns
Patterns with nested repetition are the most common cause:
(a+)+ # Nested quantifiers
([a-zA-Z]+)* # Quantifier inside quantifier
(a|aa|aaa)+ # Alternation with overlapping options
(x*)* # Star inside star
(.+\s+)+ # Multiple greedy quantifiers
Any pattern where one repetition contains another repetition is suspicious.
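That rule of thumb can be turned into a rough automated check. The sketch below is a hypothetical heuristic, not a real ReDoS analyzer: it flags a parenthesized group that contains + or * and is itself followed by + or *. It ignores {m,n} repetition, groups nested inside groups, and overlapping alternation, and it can misfire on escaped characters such as \+.

```python
import re

# Hypothetical heuristic: "(" ... a + or * ... ")" followed directly by + or *.
# Only groups without inner parentheses are considered.
_NESTED = re.compile(r"\((?:\?:)?[^()]*[+*][^()]*\)[+*]")


def has_nested_quantifier(pattern: str) -> bool:
    """Return True if the pattern text looks like it nests + or * repetition."""
    return _NESTED.search(pattern) is not None


for candidate in ["(a+)+", "([a-zA-Z]+)*", "(x*)*", r"(.+\s+)+", "(a|aa|aaa)+", "a+b"]:
    print(candidate, has_nested_quantifier(candidate))
```

Note that (a|aa|aaa)+ is reported as False here even though it is dangerous, so treat a clean result as "not obviously nested", not as "safe".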
Real-World Example: Email Validation
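A naive email regex of the kind that often gets blamed for catastrophic backtracking:
^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
On a long invalid input like:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa@
The engine matches up to the @, finds nothing after it, and then backs off the first group one character at a time before failing. With this exact pattern that work stays roughly linear, because no quantifier is nested. It turns exponential as soon as a character class is wrapped in a repeated group, for example ([a-zA-Z0-9_\-\.]+)+@, a common shape in email patterns that try to allow dotted segments.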
Related reading: Best Regex for Email Validation in JavaScript
Real-World Example: HTML Tag Matching
<([a-z]+)([^<]+)*(?:>(.*)<\/\1>|\s+\/>)
This pattern attempts to match HTML tags. On malformed HTML, it can backtrack catastrophically because ([^<]+)* nests a + inside a *, so the same run of characters can be split between iterations in exponentially many ways.
Related reading: You Should Not Parse HTML with Regex —But Here's Why Everyone Tries
JavaScript: A Dangerous Pattern
// DANGEROUS: catastrophic backtracking on non-matching input
const regex = /^(a+)+b$/;
function test(pattern, input) {
  const start = performance.now();
  try {
    const result = pattern.test(input);
    const elapsed = performance.now() - start;
    console.log(`Result: ${result}, Time: ${elapsed.toFixed(2)}ms`);
  } catch (e) {
    console.log(`Error: ${e.message}`);
  }
}
// Short input: fast
test(regex, "aaaaaab"); // Fast: matches
// Long non-matching input: very slow
test(regex, "aaaaaaaaaaaaaaaaaaaaaaaaaaaa"); // Could take seconds
If you run this, you will see the second call take dramatically longer. Each additional a roughly doubles the time, and nothing interrupts the match: the call simply blocks until the engine gives up.
Python: Same Pattern, Same Problem
import re
import time
# DANGEROUS
pattern = re.compile(r'^(a+)+b$')
def test_match(text):
    start = time.perf_counter()
    match = pattern.match(text)
    elapsed = time.perf_counter() - start
    print(f"Result: {bool(match)}, Time: {elapsed:.4f}s")
test_match("aaaaaab") # Fast
test_match("aaaaaaaaaaaaaaaaaaaaaaaaaaaa") # Very slow: there is no b, so every grouping is tried
How to Identify Catastrophic Backtracking
1. Look for Nested Quantifiers
(a+)+ # Nested +
(\w*)* # Nested *
(\d+)* # Mixed nested
2. Look for Alternation with Overlap
(a|aa|aaa)+ # Overlapping alternatives
(\d|\w)+ # Overlapping character classes
3. Look for Adjacent Quantifiers That Can Match the Same Text
.*\d+ # Greedy .* and \d+ compete for the same digits
4. Test with Long Non-Matching Input
If a regex takes significantly longer on non-matching input than matching input, catastrophic backtracking is likely.
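One way to run that test is to time the pattern on non-matching input of increasing length and watch how fast the numbers grow. The following is a minimal sketch. The helper names and the chosen sizes are assumptions, and the sizes are deliberately small so the risky pattern still finishes quickly.

```python
import re
import time


def time_match(pattern, text):
    """Return the seconds spent on one match attempt."""
    start = time.perf_counter()
    pattern.match(text)
    return time.perf_counter() - start


def growth_profile(pattern, char="a", sizes=(12, 14, 16, 18)):
    """Time the pattern on non-matching runs of increasing length."""
    return [(n, time_match(pattern, char * n)) for n in sizes]


risky = re.compile(r"^(a+)+b$")
safe = re.compile(r"^a+b$")

for name, pattern in (("risky", risky), ("safe", safe)):
    for n, seconds in growth_profile(pattern):
        print(f"{name} n={n}: {seconds:.5f}s")
```

If adding two characters multiplies the time by about four, the growth is exponential. A safe pattern stays flat across the same sizes.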
How to Fix Catastrophic Backtracking
Fix 1: Remove Nested Quantifiers
Instead of:
(a+)+b
Use:
a+b
If you need one or more a followed by b, just use a+b. The outer group is unnecessary.
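Before swapping one pattern for another, it helps to confirm that both accept the same input. Here is a small brute-force sketch in Python that compares the nested and flat versions on every string over a and b up to a short length. The function name and the length limit are illustrative choices.

```python
import itertools
import re

nested = re.compile(r"(a+)+b")
flat = re.compile(r"a+b")


def span_or_none(pattern, text):
    """Return the (start, end) of the first match, or None."""
    found = pattern.search(text)
    return found.span() if found else None


def same_verdict(max_len=8, alphabet="ab"):
    """Check that both patterns match the same span on every short string."""
    for n in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=n):
            text = "".join(chars)
            if span_or_none(nested, text) != span_or_none(flat, text):
                return False
    return True


print(same_verdict())  # True
```

The one thing that changes is the capture group: a+b has no group 1, so code that reads the captured text needs adjusting.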
Fix 2: Use Possessive Quantifiers (Where Supported)
Possessive quantifiers prevent backtracking:
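(a++)+b # a++ is possessive: it never gives characters back
JavaScript does NOT support possessive quantifiers natively. Python's re module supports them from version 3.11; on older versions, the third-party regex module does. PCRE and Java support them. .NET does not, but it does support atomic groups.
In JavaScript, you can emulate one with a lookahead and a backreference. The lookahead captures the run of a characters, the engine does not backtrack into a completed lookahead, and \1 then consumes exactly what was captured:
// Simulate a possessive quantifier
const regex = /^(?=(a+))\1b$/;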
Fix 3: Use Atomic Groups
(?>a+)+b
Atomic groups commit to what they match and do not backtrack.
JavaScript does not support atomic groups natively. Python's re module supports them from version 3.11.
The technique also works in PCRE, Java, and .NET.
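In Python, the re module accepts atomic groups starting with version 3.11. The sketch below picks the atomic form when it is available and otherwise falls back to the lookahead-plus-backreference emulation. The names SAFE and fails_fast and the one-second budget are illustrative.

```python
import re
import sys
import time

if sys.version_info >= (3, 11):
    # Atomic group: once a+ has matched, its characters are never given back.
    SAFE = re.compile(r"^(?>a+)+b$")
else:
    # Emulation for older versions: the lookahead captures the run of a
    # characters, and \1 consumes exactly that run.
    SAFE = re.compile(r"^(?=(a+))\1b$")


def fails_fast(text, budget_sec=1.0):
    """Return True if a non-matching input is rejected within the budget."""
    start = time.perf_counter()
    matched = SAFE.match(text)
    return matched is None and (time.perf_counter() - start) < budget_sec


print(bool(SAFE.match("aaab")))
print(fails_fast("a" * 5000))
```

With the plain (a+)+b pattern, a 5000-character run of a would never finish. Here the engine has only one way to consume the run, so it fails after a single pass.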
Fix 4: Use Character Classes Instead of Alternation
Instead of:
(a|b|c|d)+
Use:
[a-d]+
A character class matches one character in a single step, so the engine has no alternatives to backtrack between.
Fix 5: Anchor Early
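Anchors limit where the engine starts matching, so a failed attempt is not retried from every position in the string. They do not remove nested quantifiers, so anchor and flatten together:
// Still dangerous: anchored, but the nested quantifier is exponential on "1111111111111111111111111111x"
const bad = /^(\d+)+$/;
// Safe: anchored, with a single quantifier
const good = /^\d+$/;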
Fix 6: Use String.prototype.includes() for Simple Cases
Sometimes the simplest fix is avoiding regex entirely:
// Instead of a backtracking-prone regex
if (/^.*foo.*$/.test(input)) {
  // ...
}
// Use includes()
if (input.includes("foo")) {
  // ...
}
JavaScript: ReDoS Prevention Checklist
Before deploying any regex to production, check:
- Are there nested quantifiers? → Fix or simplify
- Does the alternation overlap? → Make the alternatives mutually exclusive or use character classes
- What happens with long input (100+ characters)? → Test it
- Is the regex exposed to user input? → Add input length limits
- Could the regex be part of a hot path? → Optimize or cache
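The input length item is the cheapest one to implement. Here is a minimal Python sketch of a validator that rejects oversized input before the regex runs. The limit of 254 characters and the USERNAME pattern are assumptions chosen for illustration, not recommendations.

```python
import re

MAX_INPUT_LENGTH = 254  # assumed limit; pick one that fits the field

# Hypothetical pattern with a single quantifier and no nesting.
USERNAME = re.compile(r"[A-Za-z0-9_]+")


def validate(text, pattern=USERNAME, max_len=MAX_INPUT_LENGTH):
    """Reject oversized input before the regex engine ever sees it."""
    if len(text) > max_len:
        return False
    return pattern.fullmatch(text) is not None


print(validate("alice_01"))   # True
print(validate("a" * 255))    # False
print(validate("bad name"))   # False
```

A length cap only bounds the damage. An exponential pattern can still be slow well below 254 characters, so the cap complements fixing the pattern and does not replace it.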
Related reading: Common Regex Mistakes Developers Keep Making
Timeout Approaches
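If you cannot fix the regex immediately, add timeouts. In Node.js, regex.test() is synchronous, so a timer on the same thread cannot fire until the match has finished. The match has to run in a worker thread that can be terminated:
const { Worker } = require("node:worker_threads");
// Code run inside the worker: rebuild the regex and report the result
const workerSource = `
  const { parentPort, workerData } = require("node:worker_threads");
  const { source, flags, text } = workerData;
  parentPort.postMessage(new RegExp(source, flags).test(text));
`;
function safeTest(regex, text, timeoutMs = 1000) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { source: regex.source, flags: regex.flags, text },
    });
    // The main thread stays free, so this timer can fire mid-match
    const timer = setTimeout(() => {
      worker.terminate();
      reject(new Error("Regex timed out"));
    }, timeoutMs);
    worker.once("message", (result) => {
      clearTimeout(timer);
      resolve(result);
    });
    worker.once("error", (e) => {
      clearTimeout(timer);
      reject(e);
    });
  });
}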
This is a workaround, not a solution. The better fix is always to fix the regex.
Python: Timeout with signal
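import signal
# Named RegexTimeout so it does not shadow the built-in TimeoutError
class RegexTimeout(Exception):
    pass
def handler(signum, frame):
    raise RegexTimeout("Regex timed out")
def safe_match(pattern, text, timeout_sec=1):
    # SIGALRM is Unix-only and must be used from the main thread
    previous = signal.signal(signal.SIGALRM, handler)
    signal.alarm(timeout_sec)
    try:
        return pattern.match(text)
    except RegexTimeout:
        return None
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler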
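The signal approach is limited to Unix and to the main thread. Where that is a problem, one portable alternative is to run the match in a child process and let subprocess enforce the deadline. The sketch below assumes the pattern and text are small enough to pass as command-line arguments. The names match_with_timeout and _CHILD are illustrative.

```python
import subprocess
import sys

# Program run by the child interpreter: prints 1 on a match, 0 otherwise.
_CHILD = (
    "import re, sys\n"
    "print(int(re.search(sys.argv[1], sys.argv[2]) is not None))\n"
)


def match_with_timeout(pattern, text, timeout_sec=1.0):
    """Return True or False for a match, or None if the deadline passed."""
    try:
        done = subprocess.run(
            [sys.executable, "-c", _CHILD, pattern, text],
            capture_output=True,
            text=True,
            timeout=timeout_sec,  # the child is killed when this expires
            check=True,           # an invalid pattern raises CalledProcessError
        )
    except subprocess.TimeoutExpired:
        return None
    return done.stdout.strip() == "1"


if __name__ == "__main__":
    print(match_with_timeout(r"^(a+)+b$", "aaab", 5.0))
    print(match_with_timeout(r"^(a+)+b$", "a" * 40, 0.5))
```

Starting an interpreter for each match is expensive, so this suits occasional checks on untrusted patterns more than a hot path.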
Safer Pattern Design Principles
Principle 1: Avoid Nested Quantifiers
# Bad
(a+)+
(\w*)*
# Good
a+
\w*
Principle 2: Be Specific
# Bad: broad, backtracking-prone
.*stuff.*
# Good: specific
stuff
Principle 3: Use Lazy Quantifiers When Appropriate
# Greedy: grabs everything, then backs off to the last "end"
.*end
# Lazy: expands one character at a time and stops at the first "end"
.*?end
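A quick Python comparison shows what the lazy form actually changes. The sample strings are made up for illustration.

```python
import re

text = "start end middle end tail"

greedy = re.search(r".*end", text).group()
lazy = re.search(r".*?end", text).group()

print(greedy)  # start end middle end
print(lazy)    # start end

# With no "end" in the input, both forms fail.
print(re.search(r".*end", "abc"), re.search(r".*?end", "abc"))  # None None
```

Laziness changes which match is found and how much work a successful match takes. It does not make a nested quantifier safe, because on non-matching input the engine still has to try every alternative.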
Related reading: Regex Greedy vs Lazy Matching Explained Simply
ReDoS: Regular Expression Denial of Service
Catastrophic backtracking is a security vulnerability.
Attackers can craft input that triggers exponential backtracking in your regex, causing:
- CPU exhaustion
- denial of service
- application timeouts
- cascading failures in microservices
Public npm packages have been vulnerable to ReDoS. Always treat regex patterns in authentication, validation, and data parsing as potential attack surfaces.
Tools for Detecting Dangerous Patterns
Several tools can detect catastrophic backtracking:
- Regex101 debugger: shows backtracking steps
- safe-regex npm package: checks for exponential patterns
- rxxr2: ReDoS analyzer
- Static analysis in ESLint plugins
Use these in CI pipelines to catch dangerous patterns before deployment.
FAQ
What is catastrophic backtracking in regex?
Catastrophic backtracking occurs when nested repetition or overlapping alternation causes the regex engine to explore an exponential number of matching combinations.
What causes catastrophic backtracking?
Nested quantifiers like (a+)+, overlapping alternatives like (a|aa|aaa)+, and certain greedy patterns combined with specific inputs.
How do I fix catastrophic backtracking?
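Remove nested quantifiers, use character classes instead of alternation, or use atomic groups or possessive quantifiers where supported. Anchors and input length limits reduce the damage but do not fix a nested quantifier on their own.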
Does JavaScript support atomic groups?
No. JavaScript does not support atomic groups or possessive quantifiers natively. You must restructure the pattern.
What is ReDoS?
Regular Expression Denial of Service: a security attack that uses crafted input to trigger catastrophic backtracking, causing CPU exhaustion.
How do I test for catastrophic backtracking?
Test your regex with long (100+ character) non-matching input. If it takes significantly longer than matching input, you have a problem.
Can catastrophic backtracking happen in Python?
Yes. Python's re module is vulnerable to the same patterns.
Final Thoughts
Catastrophic backtracking is one of the few regex issues that can cause real production outages.
The dangerous patterns look innocent:
(a+)+b
One nested quantifier. That is all it takes.
The fix is usually simple: remove the nesting, use character classes, anchor the pattern, or use a string method instead.
The hard part is knowing to look for it. Most developers only discover catastrophic backtracking when a production incident forces them to.
Test your regex patterns against long inputs. Run them through ReDoS detection tools. Review patterns for nested quantifiers.
And when in doubt, the Regex Tester lets you test patterns against various inputs to catch performance issues before they reach production.