Regular expressions are tiny programs for pattern matching. They shine at “find all invoice numbers that look like INV-####-YY” and fail loudly at “parse arbitrary HTML with nested tables.” Your job is knowing which side of that line you are standing on before you ship regex to production.
Literals vs metacharacters
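Most characters match themselves. Metacharacters such as . * + ? [] () | have special meaning instead. To match one literally, escape it with a backslash: a hostname needs toollabz\.com, because an unescaped dot matches any character and toollabz.com would also accept toollabzXcom.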
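A minimal sketch of the difference an escaped dot makes, written in Python (the choice of language is an assumption; the article does not name one). `re.escape` is the standard-library helper for escaping a literal string.

```python
import re

# Unescaped dot: "." matches any character, so look-alike hosts slip through.
loose = re.compile(r"toollabz.com")
# Escaped dot: only a literal "." is accepted.
strict = re.compile(r"toollabz\.com")

print(bool(loose.fullmatch("toollabzXcom")))   # True
print(bool(strict.fullmatch("toollabzXcom")))  # False
print(bool(strict.fullmatch("toollabz.com")))  # True

# re.escape builds the escaped form from a plain string.
escaped = re.escape("toollabz.com")
print(escaped)  # toollabz\.com
```

Building patterns from user-supplied literals is the case where `re.escape` tends to matter most, since hand-escaping is easy to forget.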
Anchors and boundaries save careers
^ and $ anchor to the start and end of a line (or of the whole string, depending on flags). Word boundaries (\b) stop “cat” from matching inside “scatter”. Without anchors, validation only asks whether some substring matches, so input such as not-an-email can still pass if any part of it looks like a TLD.
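The sketch below illustrates both points in Python. The email shape is a hypothetical, deliberately simplistic pattern used only to show the effect of anchoring; it is not a recommended validator.

```python
import re

# Word boundaries: \bcat\b matches the word "cat" but not the "cat" inside "scatter".
word_cat = re.compile(r"\bcat\b")
print(bool(word_cat.search("scatter")))      # False
print(bool(word_cat.search("the cat sat")))  # True

# Hypothetical, simplistic email shape used only for illustration.
EMAIL_SHAPE = r"[^@\s]+@[^@\s]+\.[a-z]{2,}"

unanchored = re.compile(EMAIL_SHAPE)
anchored = re.compile(r"^" + EMAIL_SHAPE + r"$")

junk = "not-an-email <x@y.io> trailing garbage"
print(bool(unanchored.search(junk)))    # True: a substring looks valid
print(bool(anchored.search(junk)))      # False: the whole input must fit
print(bool(anchored.search("x@y.io")))  # True
```

One Python-specific caveat: `$` also matches just before a single trailing newline, so `re.fullmatch` is the stricter choice when the entire string must conform.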
Three practical patterns
- Hex colors: ^#([0-9a-fA-F]{3}|[0-9a-fA-F]{6})$
- ISO-like dates: ^\d{4}-\d{2}-\d{2}$ (still does not validate February 30; regex is not a calendar).
- Slugs: ^[a-z0-9]+(?:-[a-z0-9]+)*$
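As a rough sketch, here are the three patterns in Python with explicit repetition counts ({3}, {6}, {4}, {2}). The helper `is_real_date` is a hypothetical name; it pairs the regex shape check with `datetime.date.fromisoformat` to cover the calendar part that regex cannot.

```python
import re
from datetime import date

HEX_COLOR = re.compile(r"^#([0-9a-fA-F]{3}|[0-9a-fA-F]{6})$")
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")
SLUG = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")

def is_real_date(text: str) -> bool:
    """Shape check with regex, then a calendar check with datetime."""
    if not ISO_DATE.match(text):
        return False
    try:
        date.fromisoformat(text)
    except ValueError:
        return False
    return True

print(bool(HEX_COLOR.match("#fff")))        # True
print(bool(HEX_COLOR.match("#ffff")))       # False: four digits is neither 3 nor 6
print(bool(SLUG.match("my-first-post")))    # True
print(bool(SLUG.match("my--post")))         # False: doubled hyphen
print(bool(ISO_DATE.match("2024-02-30")))   # True: the shape is fine
print(is_real_date("2024-02-30"))           # False: the calendar disagrees
print(is_real_date("2024-02-29"))           # True: 2024 is a leap year
```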
Catastrophic backtracking (the CPU bonfire)
Nested quantifiers like (a+)+$ run against slightly mismatched input can explode into exponentially many trial paths on backtracking engines (the classic “NFA” style used by most mainstream languages). Symptoms: a regex that “works in unit tests” but wedges production when fed 2 KB of attacker-controlled text. Mitigations: possessive quantifiers and atomic groups where supported, explicit character classes instead of .* soup, or refusing the job and parsing with a real tokenizer.
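The effect can be observed on a small scale. The sketch below, in Python, times the nested pattern against a flattened equivalent on deliberately short inputs, then adds a length guard. `seconds_to_reject`, `safe_match`, and the `MAX_INPUT` value are hypothetical names and numbers chosen for illustration; timings vary by machine, so none are shown.

```python
import re
import time

NESTED = re.compile(r"(a+)+$")  # nested quantifier: many ways to split a run of "a"
FLAT = re.compile(r"a+$")       # accepts the same strings, with one way to match

def seconds_to_reject(pattern: re.Pattern, n: int) -> float:
    """Time a failing match against n copies of "a" followed by "b"."""
    text = "a" * n + "b"
    start = time.perf_counter()
    result = pattern.match(text)
    elapsed = time.perf_counter() - start
    assert result is None  # the trailing "b" makes every attempt fail
    return elapsed

# Keep n small: the nested form roughly doubles its work per extra character.
for n in (10, 14, 18):
    print(n, seconds_to_reject(NESTED, n), seconds_to_reject(FLAT, n))

MAX_INPUT = 256  # hypothetical cap; pick one that fits your data

def safe_match(pattern: re.Pattern, text: str):
    """Refuse oversized input before the regex engine sees it."""
    if len(text) > MAX_INPUT:
        raise ValueError("input too long for regex validation")
    return pattern.match(text)
```

Flattening the pattern is the portable fix. Python 3.11 and later also accept atomic groups `(?>...)` and possessive quantifiers such as `a++`, but older versions reject that syntax, so the sketch avoids it.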
Flags you should set on purpose
Case-insensitive search needs i; multiline line anchors need m; dot-all behavior (if available) changes whether . crosses newlines. Document flags beside every stored pattern - future you will not remember whether ^ meant string start or logical line start.
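For a concrete feel, here is how the three flags behave in Python, where i, m, and dot-all are spelled `re.I`, `re.M`, and `re.S`. The `STORED` registry is a hypothetical structure showing one way to keep flags next to the pattern they belong to.

```python
import re

text = "Total: 5\ntotal: 7"

# i: case-insensitive matching.
print(re.findall(r"total", text))               # ['total']
print(re.findall(r"total", text, flags=re.I))   # ['Total', 'total']

# m: ^ and $ match at every line, not just the ends of the string.
print(re.findall(r"^\w+", text))                # ['Total']
print(re.findall(r"^\w+", text, flags=re.M))    # ['Total', 'total']

# s (dot-all): "." also matches newlines.
print(re.search(r"5.total", text))                    # None
print(bool(re.search(r"5.total", text, flags=re.S)))  # True

# Hypothetical registry entry: the pattern and its flags are stored together.
STORED = {"line_label": {"pattern": r"^\w+", "flags": re.I | re.M}}
entry = STORED["line_label"]
print(re.findall(entry["pattern"], text, flags=entry["flags"]))  # ['Total', 'total']
```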
Regex vs parser
| Task | Regex OK? |
|---|---|
| Extract order IDs from logs | Usually yes with anchors + tests |
| Parse nested JSON | No - use JSON.parse + schema |
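Both rows of the table can be sketched side by side in Python, where `json.loads` plays the role of `JSON.parse`. The log lines and the order-ID format (ORD- plus six digits) are invented for illustration, and schema validation is left out because it would need a third-party library.

```python
import json
import re

# Hypothetical log lines and order-ID format (ORD- plus six digits).
LOG = """\
2024-05-01 12:00:01 INFO order=ORD-004217 status=paid
2024-05-01 12:00:02 WARN order=ORD-99 status=rejected
2024-05-01 12:00:03 INFO order=ORD-004218 status=paid
"""

# Flat, well-delimited tokens: a bounded regex is a reasonable tool.
ORDER_ID = re.compile(r"\border=(ORD-\d{6})\b")
print(ORDER_ID.findall(LOG))  # ['ORD-004217', 'ORD-004218']

# Nested JSON: hand it to a parser instead of matching braces with regex.
payload = '{"order": {"id": "ORD-004217", "items": [{"sku": "A1", "qty": 2}]}}'
data = json.loads(payload)
print(data["order"]["items"][0]["qty"])  # 2
```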
Bridge to JSON tooling
After extracting JSON substrings from logs, validate them with the JSON validator and pretty-print them with the JSON formatter. For conceptual background, read “JSON formatting explained.”
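The same extract, validate, and pretty-print steps can be approximated in code. This Python sketch assumes each log line embeds at most one JSON object of interest; `extract_json` is a hypothetical helper built on the standard library's `json.JSONDecoder.raw_decode`, which parses a value starting at a given index and ignores trailing text.

```python
import json

decoder = json.JSONDecoder()

def extract_json(line: str):
    """Return the first valid JSON object embedded in a log line, or None."""
    start = line.find("{")
    while start != -1:
        try:
            obj, _end = decoder.raw_decode(line, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            start = line.find("{", start + 1)
    return None

good = '12:00:01 INFO payload={"id": 7, "tags": ["a", "b"]} done'
bad = '12:00:02 WARN payload={"id": 7, "tags": [} done'

# Validation is the parse itself; pretty-printing is json.dumps with indent.
print(json.dumps(extract_json(good), indent=2, sort_keys=True))
print(extract_json(bad))  # None
```

Letting the parser decide where the object ends avoids writing a brace-matching regex, which is the kind of nested structure the table above warns against.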
Regex tester workflow
Use the regex tester with three fixtures: a matching example, a near-miss, and a malicious counterexample (Unicode homoglyphs, extra whitespace). Choose global vs non-global matching deliberately; global replacements have surprised many a code review.
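The fixture idea carries over to automated tests. The Python sketch below checks a slug pattern (written without anchors and applied with `re.fullmatch`, which is an assumption about how you would call it) against the three fixture kinds. Python has no global flag; `re.sub` replaces every match by default, and the `count` argument gives the non-global behavior.

```python
import re

SLUG = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")

def is_slug(text: str) -> bool:
    # fullmatch requires the whole string to fit, including any trailing newline.
    return SLUG.fullmatch(text) is not None

FIXTURES = [
    ("my-first-post", True),    # matching example
    ("my--first-post", False),  # near miss: doubled hyphen
    ("my-p\u043est", False),    # homoglyph: Cyrillic U+043E instead of Latin "o"
    (" my-post", False),        # extra leading whitespace
    ("my-post\n", False),       # trailing newline
]

for text, expected in FIXTURES:
    assert is_slug(text) is expected, repr(text)

# Replace-all versus replace-first, chosen on purpose.
print(re.sub(r"-", "_", "a-b-c"))           # a_b_c
print(re.sub(r"-", "_", "a-b-c", count=1))  # a_b-c
```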
Developer hub
More utilities await on the developer tools hub.