Reading Basic Regex
Character classes, quantifiers, anchors, and common patterns in plain English
Basic regex vocabulary
- ^ and $ = start and end anchors; ^...$ matches the whole string
- [A-Za-z] = character class; \d = digit; \w = word char; \s = whitespace
- + = one or more; * = zero or more; ? = optional (zero or one); {n} = exactly n
- . = any character except newline; \. = literal dot
- /i flag — case-insensitive; /g flag — global (find all matches)
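To make the two flags concrete, here is a minimal JavaScript sketch. The sample text is invented for illustration; the only API it relies on is the standard String.prototype.match.

```javascript
// Hypothetical sample text, used only for illustration.
const text = "Cat, cat, CAT";

// No flags: case-sensitive, and match() reports only the first match.
const first = text.match(/cat/);

// /g: match() returns every match as an array of strings.
const allLower = text.match(/cat/g);

// /gi: every match, ignoring case.
const allAnyCase = text.match(/cat/gi);

console.log(first[0]);    // "cat"
console.log(allLower);    // ["cat"]
console.log(allAnyCase);  // ["Cat", "cat", "CAT"]
```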
How would you describe this regex in plain English? /^[A-Za-z]+$/
Letters only, one or more, from start to end of the string. Basic regex vocabulary:
- ^ — start of string (or start of line in multiline mode)
- $ — end of string (or end of line)
- [A-Za-z] — character class: any letter A-Z uppercase or a-z lowercase
- + — one or more of the preceding element
- * — zero or more; ? — zero or one; {n} — exactly n; {n,m} — between n and m
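The effect of the anchors is easiest to see by comparing the pattern with and without them. This is a small JavaScript sketch; the variable names and test strings are assumptions chosen for illustration.

```javascript
// Anchored: the whole string must consist of letters.
const lettersOnly = /^[A-Za-z]+$/;

// Unanchored: it is enough for the string to contain a run of letters.
const containsLetters = /[A-Za-z]+/;

console.log(lettersOnly.test("Hello"));       // true
console.log(lettersOnly.test("Hello1"));      // false (the digit breaks the match)
console.log(lettersOnly.test(""));            // false (+ requires at least one letter)
console.log(containsLetters.test("Hello1"));  // true (matches the "Hello" part)
```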
What does this regex match? /\d{3}-\d{3}-\d{4}/
Three digits, dash, three digits, dash, four digits — US phone number format. Digit and quantifier vocabulary:
- \d — any digit 0-9 (equivalent to [0-9])
- \d{3} — exactly 3 digits
- \d{4} — exactly 4 digits
- - — literal hyphen (no escaping needed outside character classes)
- \w — word character: [A-Za-z0-9_]
- \s — whitespace: space, tab, newline
- \D, \W, \S — negations: not digit, not word char, not whitespace
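A short JavaScript sketch of the digit vocabulary in use. The phone number and sentence are made up for illustration. Note that the pattern as written has no anchors, so it can match inside a longer string; the anchored variant is an added assumption, not part of the original question.

```javascript
// The pattern from the question: unanchored, so it can match inside longer text.
const phone = /\d{3}-\d{3}-\d{4}/;

// Assumed variant with anchors, for checking that a whole string is a phone number.
const phoneWholeString = /^\d{3}-\d{3}-\d{4}$/;

const sentence = "Call 555-867-5309 today";  // hypothetical input

const found = sentence.match(phone)[0];
const wholeSentence = phoneWholeString.test(sentence);
const wholeNumber = phoneWholeString.test("555-867-5309");

// \D is the negation of \d: strip everything that is not a digit.
const digitsOnly = "555-867-5309".replace(/\D/g, "");

console.log(found);          // "555-867-5309"
console.log(wholeSentence);  // false
console.log(wholeNumber);    // true
console.log(digitsOnly);     // "5558675309"
```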
A validator uses: /^[\w.-]+@[\w.-]+\.[a-z]{2,}$/i. Describe what this regex validates.
An email address format — local part @ domain . TLD (2+ letters), case-insensitive. Breaking it down:
- ^ = start of string
- [\w.-]+ = local part: word chars, dots, hyphens; + = one or more
- @ = literal @ sign
- [\w.-]+ = domain name
- \. = literal dot (backslash escapes the special meaning of .)
- [a-z]{2,} = TLD: at least 2 lowercase letters
- $ = end of string
- /i flag = case-insensitive matching
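The breakdown can be tried against a few inputs. This JavaScript sketch uses invented addresses and an assumed variable name; it also shows that the pattern is a loose format check rather than full email validation, since it accepts some malformed domains.

```javascript
const emailPattern = /^[\w.-]+@[\w.-]+\.[a-z]{2,}$/i;

// Hypothetical inputs, chosen to exercise each part of the pattern.
console.log(emailPattern.test("jane.doe@example.com"));  // true
console.log(emailPattern.test("JANE@EXAMPLE.ORG"));      // true (the i flag ignores case)
console.log(emailPattern.test("jane@example"));          // false (no dot followed by a TLD)
console.log(emailPattern.test("jane@example.c"));        // false (TLD needs 2+ letters)
console.log(emailPattern.test("jane doe@example.com"));  // false (space is not in [\w.-])

// The pattern is permissive: consecutive dots in the domain still pass.
console.log(emailPattern.test("a@b..com"));              // true
```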
What does the dot . match in regex, and how is it different from \.?
. = any character (except newline); \. = literal dot. Escaping in regex:
- . = metacharacter; matches any single character except newline (with the s flag it also matches newline)
- \. = escaped dot; matches only a literal period character
- Other metacharacters that need escaping for literal use:
\( \) \[ \] \{ \} \* \+ \? \^ \$ \| \\
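The difference between an unescaped and an escaped dot can be checked directly in JavaScript. In the sketch below, escapeRegExp is a hypothetical helper (not a built-in) that backslash-escapes the metacharacters so arbitrary text can be matched literally; the inputs are invented for illustration.

```javascript
const unescaped = /1.0/;   // . matches any single character except newline
const escaped = /1\.0/;    // \. matches only a literal period

console.log(unescaped.test("1X0"));  // true
console.log(escaped.test("1X0"));    // false
console.log(escaped.test("1.0"));    // true

// With the s flag, . also matches a newline.
console.log(/a.b/.test("a\nb"));     // false
console.log(/a.b/s.test("a\nb"));    // true

// Hypothetical helper: prefix every regex metacharacter with a backslash.
function escapeRegExp(literal) {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Building a pattern from text that should be matched literally.
const fromText = new RegExp(escapeRegExp("1.0"));
console.log(fromText.source);        // 1\.0
console.log(fromText.test("1X0"));   // false
```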
/1.0/ would match "1.0", "1X0", "1 0", etc. The correct regex for a literal "1.0" is /1\.0/. In code review: "This regex uses an unescaped dot. Did you mean to match any character, or just a literal dot?"
A regex uses: /colou?r/. What does the ? quantifier do here?
? makes the preceding element optional (zero or one) — matches both "color" and "colour". Quantifier vocabulary:
- ? — zero or one; the element is optional
- * — zero or more; greedy by default (matches as much as possible)
- + — one or more
- {n} — exactly n; {n,} — n or more; {n,m} — between n and m
- Greedy vs lazy: .* is greedy (matches as much as possible); .*? is lazy (matches as little as possible)
/colou?r/: "Matches the literal string 'colo', then optionally 'u', then 'r', so it accepts both American and British spellings."
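The quantifiers can be exercised in a few lines of JavaScript. This is a sketch with invented sample strings; the {2,4} pattern and the markup snippet are assumptions added to illustrate range quantifiers and greedy versus lazy matching.

```javascript
// ? makes the "u" optional.
const colour = /colou?r/;
console.log(colour.test("color"));    // true
console.log(colour.test("colour"));   // true
console.log(colour.test("colouur"));  // false (? allows at most one "u")

// {n,m}: between 2 and 4 digits, anchored to the whole string.
const twoToFourDigits = /^\d{2,4}$/;
console.log(twoToFourDigits.test("123"));    // true
console.log(twoToFourDigits.test("1"));      // false
console.log(twoToFourDigits.test("12345"));  // false

// Greedy vs lazy on a hypothetical markup snippet.
const markup = "<b>bold</b>";
const greedy = markup.match(/<.*>/)[0];   // as much as possible
const lazy = markup.match(/<.*?>/)[0];    // as little as possible
console.log(greedy);  // "<b>bold</b>"
console.log(lazy);    // "<b>"
```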