Regular Expressions by Example
Regex is one of those skills that's hard to learn, easy to forget, and constantly useful. This is a working dev's reference — the parts that come up daily, with examples you can paste into the JustKit regex tester and see in action.
Anchors
Anchors aren't characters; they're positions. They match where you are in the string, not what's there.
- ^: start of string (or line, with the m flag)
- $: end of string (or line, with the m flag)
- \b: word boundary (between a word char and a non-word char)
- \B: anywhere except a word boundary
^cat$ matches the exact string "cat" but not "cats" or " cat". \bcat\b matches "cat" anywhere, including inside "the cat sat" but not inside "category".
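The anchor behavior above can be checked with a short sketch. It assumes Python's standard re module as the engine; the variable names are illustrative only.

```python
import re

# ^ and $ pin the pattern to both ends of the string.
exact = re.compile(r"^cat$")
print(bool(exact.search("cat")))    # True
print(bool(exact.search("cats")))   # False
print(bool(exact.search(" cat")))   # False

# \b requires a word/non-word transition on each side.
word = re.compile(r"\bcat\b")
print(bool(word.search("the cat sat")))  # True
print(bool(word.search("category")))     # False

# \B is the complement: "cat" only when more word chars follow.
inner = re.compile(r"cat\B")
print(bool(inner.search("category")))     # True
print(bool(inner.search("the cat sat")))  # False
```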
Character classes
- .: any character (except newline by default; the s flag includes newline)
- \d: digit (0-9)
- \D: non-digit
- \w: word char (a-z A-Z 0-9 _)
- \W: non-word char
- \s: whitespace (space, tab, newline)
- \S: non-whitespace
- [abc]: any one of a, b, c
- [^abc]: anything except a, b, c
- [a-z]: range. Inside [], the dash is a range. Outside, it's a literal dash.
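A quick way to see each class in action is to run it over one sample string. This is a sketch using Python's re module and a made-up input. Note that Python's \d and \w also match non-ASCII digits and letters on str patterns unless re.ASCII is set; the sample here is ASCII, so the results match the ranges listed above.

```python
import re

sample = "Order #42: 3 x USB-C cable, ref_id=A7"

digits = re.findall(r"\d+", sample)
words = re.findall(r"\w+", sample)
uppercase = re.findall(r"[A-Z]+", sample)
punct = re.findall(r"[^\w\s]", sample)

print(digits)     # ['42', '3', '7']
print(words)      # ['Order', '42', '3', 'x', 'USB', 'C', 'cable', 'ref_id', 'A7']
print(uppercase)  # ['O', 'USB', 'C', 'A']
print(punct)      # ['#', ':', '-', ',', '=']

# A dash placed last inside [] is a literal dash, not a range.
dashed = re.findall(r"[a-z-]+", "well-known fix")
print(dashed)     # ['well-known', 'fix']

# The dot skips newlines unless dotall (re.DOTALL, the s flag) is set.
dot_default = re.findall(r".", "a\nb")
dot_all = re.findall(r".", "a\nb", re.DOTALL)
print(dot_default)  # ['a', 'b']
print(dot_all)      # ['a', '\n', 'b']
```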
Quantifiers
- x?: 0 or 1 of x
- x*: 0 or more of x
- x+: 1 or more of x
- x{n}: exactly n of x
- x{n,m}: between n and m of x
- x{n,}: at least n of x
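Each quantifier can be exercised against strings just inside and just outside its bounds. The sketch below assumes Python's re module; full is a hypothetical helper that requires the whole string to match.

```python
import re

def full(pattern: str, text: str) -> bool:
    """Return True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

print(full(r"colou?r", "color"), full(r"colou?r", "colour"))  # True True
print(full(r"ab*", "a"), full(r"ab*", "abbb"))                # True True
print(full(r"ab+", "a"), full(r"ab+", "ab"))                  # False True
print(full(r"\d{4}", "2024"), full(r"\d{4}", "20245"))        # True False
print(full(r"\d{2,4}", "1"), full(r"\d{2,4}", "123"))         # False True
print(full(r"\d{2,}", "1234567"))                             # True
```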
By default, quantifiers are greedy — they match as much as possible. Adding a ? after them makes them lazy — match as little as possible. .* matches as much as it can; .*? matches as little as it can. The lazy form is what you want when extracting between delimiters: <.*> on <a>X</a> matches the entire string; <.*?> matches each tag separately.
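The greedy and lazy results described above are easy to reproduce. This sketch uses Python's re module; the third pattern, a negated character class, is an alternative not mentioned above that gives the same result here without relying on laziness.

```python
import re

html = "<a>X</a>"

greedy = re.findall(r"<.*>", html)     # runs to the last ">"
lazy = re.findall(r"<.*?>", html)      # stops at the first ">"
negated = re.findall(r"<[^>]*>", html) # cannot cross a ">" at all

print(greedy)   # ['<a>X</a>']
print(lazy)     # ['<a>', '</a>']
print(negated)  # ['<a>', '</a>']
```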
Groups
- (abc): capturing group. Available as $1, $2, etc., in replacements.
- (?:abc): non-capturing group. Same matching behavior, no capture slot, so slightly cheaper.
- (?<name>abc): named capture. Reference as $<name> in replacements.
- a|b: alternation. Matches a or b.
Groups also let you apply quantifiers to multi-character sequences: (ab)+ matches one or more "ab"s.
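Here is a sketch of the group types using Python's re module. Python's syntax differs from the JS-style forms above: replacements reference groups as \1 or \g<name> rather than $1 or $<name>, and named groups are spelled (?P<name>...). The sample date is made up.

```python
import re

date = "2024-03-15"

# Numbered groups, reordered in the replacement.
swapped = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", date)
print(swapped)  # 15/03/2024

# Named groups: read by name, or referenced as \g<name> in a replacement.
named = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")
match = named.fullmatch(date)
year = match.group("year")
dotted = named.sub(r"\g<day>.\g<month>.\g<year>", date)
print(year)    # 2024
print(dotted)  # 15.03.2024

# A non-capturing group still scopes the quantifier but stores nothing.
non_capturing = re.fullmatch(r"(?:ab)+", "ababab").groups()
capturing = re.fullmatch(r"(ab)+", "ababab").groups()
print(non_capturing)  # ()
print(capturing)      # ('ab',)

# Grouping keeps alternation scoped to part of the pattern.
greys = re.findall(r"gr(?:a|e)y", "gray grey groy")
print(greys)  # ['gray', 'grey']
```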
Lookarounds (zero-width assertions)
Sometimes you need to match X but only when followed/preceded by Y — without including Y in the match.
- X(?=Y): positive lookahead. Match X only when followed by Y.
- X(?!Y): negative lookahead. Match X only when not followed by Y.
- (?<=Y)X: positive lookbehind. Match X only when preceded by Y.
- (?<!Y)X: negative lookbehind. Match X only when not preceded by Y.
Most modern engines (JS post-ES2018, Python, PCRE, the third-party regexp2 package for Go) support all four. Others support fewer: JavaScript before ES2018 has lookahead only, and RE2-based engines such as Go's standard regexp package support no lookarounds at all. Check before you ship.
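The four lookarounds can be sketched as follows, assuming Python's re module (whose lookbehind must be fixed-width) and made-up sample strings. The naive negative lookahead shows a common surprise: the engine backtracks into the digits until the assertion passes.

```python
import re

sizes = "10px 20em 30px"
prices = "cost: $40, fee: $5, qty: 12"

# Positive lookahead: digits followed by "px"; "px" is not in the match.
before_px = re.findall(r"\d+(?=px)", sizes)
print(before_px)  # ['10', '30']

# Naive negative lookahead: "10" fails, so the engine settles for "1".
naive = re.findall(r"\d+(?!px)", sizes)
print(naive)  # ['1', '20', '3']

# Also forbidding a following digit stops that backtracking.
not_px = re.findall(r"\d+(?!px|\d)", sizes)
print(not_px)  # ['20']

# Positive lookbehind: digits preceded by "$".
dollars = re.findall(r"(?<=\$)\d+", prices)
print(dollars)  # ['40', '5']

# Negative lookbehind: digits not preceded by "$" or another digit.
plain = re.findall(r"(?<![$\d])\d+", prices)
print(plain)  # ['12']
```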
The 12 patterns you'll actually use
- Email (good enough): ^[\w.%+-]+@[\w.-]+\.[a-zA-Z]{2,}$ (pragmatic, not RFC 5322 perfect; perfect is ~6,000 characters long).
- URL (HTTP/HTTPS): ^https?://\S+$ (coarse but works).
- IPv4: ^(?:(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)\.){3}(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)$
- UUID v4: ^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$ (with the i flag).
- Hex color: ^#(?:[0-9a-fA-F]{3}){1,2}$ (matches #fff and #ffffff).
- Slug: ^[a-z0-9]+(?:-[a-z0-9]+)*$ (lowercase, hyphens between, no leading/trailing/double hyphens).
- ISO date (YYYY-MM-DD): ^\d{4}-\d{2}-\d{2}$ (coarse; it accepts impossible dates such as 02-30).
- Whitespace runs: \s+ (for collapsing runs of whitespace; replace with a single space).
- Trailing whitespace: \s+$ with the m flag (strips per line; because \s also matches newlines, use [ \t]+$ if blank lines must survive).
- HTML/XML tag: <([a-z][a-z0-9]*)\b[^>]*>.*?</\1> (naive but fine for snippets; don't try to fully parse HTML with regex).
- Phone (loose international): ^\+?[0-9\s\-()]{7,}$ (formats vary massively by country, so loose is best).
- Markdown header: ^(#{1,6})\s+(.+)$ with the m flag (captures level and text).
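One way to put these patterns to work is a small lookup table of compiled validators. The sketch below assumes Python's re module; PATTERNS, is_valid, and is_real_date are hypothetical names, and the sample inputs are made up. It uses fullmatch because Python's $ also matches just before a trailing newline.

```python
import re
from datetime import date

PATTERNS = {
    "email": re.compile(r"^[\w.%+-]+@[\w.-]+\.[a-zA-Z]{2,}$"),
    "url": re.compile(r"^https?://\S+$"),
    "ipv4": re.compile(
        r"^(?:(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)\.){3}"
        r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)$"
    ),
    "uuid4": re.compile(
        r"^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$",
        re.IGNORECASE,
    ),
    "hex_color": re.compile(r"^#(?:[0-9a-fA-F]{3}){1,2}$"),
    "slug": re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$"),
    "iso_date": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
    "phone": re.compile(r"^\+?[0-9\s\-()]{7,}$"),
}

def is_valid(kind: str, value: str) -> bool:
    """Check value against the named pattern, requiring a whole-string match."""
    return PATTERNS[kind].fullmatch(value) is not None

def is_real_date(value: str) -> bool:
    """Shape check via regex, then a calendar check the regex cannot do."""
    if not is_valid("iso_date", value):
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True

print(is_valid("email", "dev@example.com"))  # True
print(is_valid("ipv4", "256.1.1.1"))         # False
print(is_valid("hex_color", "#ffff"))        # False
print(is_valid("slug", "hello--world"))      # False
print(is_valid("iso_date", "2024-02-30"))    # True (shape only)
print(is_real_date("2024-02-30"))            # False

# The non-validating patterns are used with sub/findall/search instead.
collapsed = re.sub(r"\s+", " ", "a   b\t\tc")
stripped = re.sub(r"\s+$", "", "a  \nb\t", flags=re.MULTILINE)
headers = re.findall(r"^(#{1,6})\s+(.+)$", "# Title\n## Sub\ntext", re.MULTILINE)
tag = re.search(r"<([a-z][a-z0-9]*)\b[^>]*>.*?</\1>", '<p class="x">hi</p>')

print(collapsed)     # a b c
print(headers)       # [('#', 'Title'), ('##', 'Sub')]
print(tag.group(1))  # p
```

Pairing the coarse date pattern with a real calendar check keeps the regex simple and leaves month lengths and leap years to the date library.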
Flags
- i: case insensitive
- g: global (find all matches; in JS, matchAll requires it)
- m: multiline (^ and $ match per line)
- s: dotall (. matches newlines)
- u: unicode (in JS, matches by Unicode code point rather than UTF-16 code unit, and enables \p{} property escapes)
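The effect of each flag shows up when the same pattern runs with and without it. This sketch assumes Python's re module, where the flags are named re.IGNORECASE, re.MULTILINE, and re.DOTALL. Python has no g flag: re.search returns the first match, and re.findall or re.finditer return all of them. The log text is made up.

```python
import re

log = "Error: disk full\nerror: retry\nOK"

# i: case-insensitive matching.
case_sensitive = re.findall(r"error", log)
case_insensitive = re.findall(r"error", log, re.IGNORECASE)
print(case_sensitive)    # ['error']
print(case_insensitive)  # ['Error', 'error']

# m: ^ matches at the start of every line, not just the string.
first_word = re.findall(r"^\w+", log)
line_words = re.findall(r"^\w+", log, re.MULTILINE)
print(first_word)  # ['Error']
print(line_words)  # ['Error', 'error', 'OK']

# s: the dot can cross a newline.
no_dotall = re.search(r"full.error", log)
dotall = re.search(r"full.error", log, re.DOTALL)
print(no_dotall)       # None
print(dotall.group())  # 'full' and 'error' joined by the newline

# "Global" behavior: first match versus all matches.
first_only = re.search(r"\w+:", log).group()
all_matches = re.findall(r"\w+:", log)
print(first_only)   # Error:
print(all_matches)  # ['Error:', 'error:']
```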
Common mistakes
- Forgetting to escape the dot outside character classes: . outside [] matches any char, so use \. or [.] for a literal dot. Inside [] the dot is already literal.
- Greedy when you wanted lazy: ".*" on "foo" "bar" matches the entire "foo" "bar". Use ".*?".
- Not anchoring: \d{4} matches inside "abc1234567def" because it isn't anchored. Add ^ and $ for full-string match.
- Trying to parse HTML with regex: nested tags, attribute quoting, comments, CDATA. Regex can't handle the recursion. Use a real parser.
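Each of these mistakes can be reproduced in a few lines. The sketch below assumes Python's re module and made-up inputs; the last case reuses the naive tag pattern from the pattern list to show it stopping at the first closing tag when tags nest.

```python
import re

# 1. Unescaped dot: matches any character, not just ".".
loose_dot = re.fullmatch(r"1.5", "1x5")
escaped_dot = re.fullmatch(r"1\.5", "1x5")
class_dot = re.fullmatch(r"1[.]5", "1.5")
print(loose_dot is not None)    # True
print(escaped_dot is not None)  # False
print(class_dot is not None)    # True

# 2. Greedy versus lazy between quotes.
quoted = '"foo" "bar"'
greedy = re.findall(r'".*"', quoted)
lazy = re.findall(r'".*?"', quoted)
print(greedy)  # ['"foo" "bar"']
print(lazy)    # ['"foo"', '"bar"']

# 3. Unanchored: four digits found inside a longer run.
unanchored = re.search(r"\d{4}", "abc1234567def").group()
anchored = re.search(r"^\d{4}$", "abc1234567def")
print(unanchored)  # 1234
print(anchored)    # None

# 4. Nested HTML: the match ends at the first </div>, not the matching one.
nested = "<div><div>x</div></div>"
partial = re.search(r"<([a-z][a-z0-9]*)\b[^>]*>.*?</\1>", nested).group()
print(partial)  # <div><div>x</div>
```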
JustKit's role
The JustKit regex tester shows live matches as you type, with each match highlighted in the input. Use it to iterate on a pattern, paste in real-world test data, and verify edge cases before shipping.