Regex Extract Matches
Pull all the emails, URLs, IDs, or anything else matching a pattern out of a block of text and into a tidy list. Choose whole matches or a specific capture group, then deduplicate, sort, and join them however you like. It updates as you type and runs locally.
How to use Regex Extract Matches
Enter a Pattern and paste your text. Every match is collected into the list below, updating live; the count line shows how many items you have. Copy list grabs the result. By default the whole match is extracted, but if your pattern has capture groups you can choose Group 1, 2, or 3 to pull out just that part — for example, capturing the domain from an email or the ID from a URL.
The options shape the output. Deduplicate removes repeated values so each distinct match appears once. Sort orders the list alphabetically with number-aware (natural) ordering, so numeric parts compare by value rather than character by character. Ignore case applies the i flag to matching. Finally, Join with controls how the items are separated in the output box: one per line by default, or comma-, space-, or pipe-separated when you need a single inline string to paste into code or a query.
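To make the Sort and Ignore case options concrete, here is a minimal sketch in Python of what case-insensitive matching followed by a number-aware sort could look like. It is an illustration only, not the tool's actual implementation; the helper name natural_key and the sample text are assumptions made for this example.

```python
import re

def natural_key(value: str):
    """Hypothetical helper: build a sort key where digit runs compare as numbers."""
    # Splitting on a captured digit run alternates text, digits, text, ...
    # so odd positions are always digit runs.
    parts = re.split(r"(\d+)", value)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]

text = "item10 ITEM2 item1 Item2"

# re.IGNORECASE plays the role of the i flag.
matches = re.findall(r"item\d+", text, flags=re.IGNORECASE)

# A plain alphabetical sort would put "item10" before "item2";
# the natural key compares 2 and 10 as numbers instead.
ordered = sorted(matches, key=natural_key)

# Join with a comma, one of the separators mentioned above.
joined = ", ".join(ordered)
print(joined)  # item1, ITEM2, Item2, item10
```

Values that differ only in case get equal keys here, so they keep their original relative order.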
Matching is global across the whole input, and empty results from an optional group are skipped. It all runs in your browser, so logs, exports, and other sensitive text stay on your machine and the extraction is instant.
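The skipping of empty results can be sketched as follows. This is a hedged illustration in Python of the behaviour described above, not the tool's own code; the pattern and sample text are made up for the example.

```python
import re

# Group 1 is optional: it only participates when "#<digits>" follows "user".
pattern = re.compile(r"user(?:#(\d+))?")
text = "user#42 user user#7"

# finditer scans the whole input, like a global match.
# group(1) is None wherever the optional group did not take part.
values = [m.group(1) for m in pattern.finditer(text)]

# Drop the empty results so only real captures reach the list.
kept = [v for v in values if v]
print(kept)  # ['42', '7']
```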
Extraction versus searching
Most people first meet regular expressions as a search tool — does this text contain a match, and where? But an equally important use is extraction: gathering every piece of text that fits a pattern into a structured list you can work with. The difference matters because the goal is not to locate or count but to collect. Pulling all email addresses out of a message, every URL from an HTML dump, all the order numbers from a report, or each hashtag from a caption are all the same underlying task — run the pattern globally and keep what it matches.
The key technique that makes extraction powerful is the capture group. A regex match is not just the whole matched string; parentheses in the pattern create numbered sub-captures, and often the part you actually want is inside one of them. Suppose you match href="([^"]+)" against some HTML: the whole match includes the href=" and the closing quote, but group 1 is the clean URL by itself. Choosing which group to extract lets you target exactly the substring you need rather than the scaffolding around it. This is why a good extractor lets you pick the whole match or a specific group — it turns a rough pattern into a precise data puller.
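The href example can be tried directly. The sketch below, in Python, contrasts the whole match with group 1 for that same pattern; the HTML snippet is an invented sample.

```python
import re

html = '<a href="https://example.com/a">A</a> <a href="/docs">Docs</a>'
pattern = re.compile(r'href="([^"]+)"')

# Group 0 is the whole match, scaffolding included.
whole = [m.group(0) for m in pattern.finditer(html)]

# Group 1 is only the text between the quotes.
urls = [m.group(1) for m in pattern.finditer(html)]

print(whole)  # ['href="https://example.com/a"', 'href="/docs"']
print(urls)   # ['https://example.com/a', '/docs']
```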
Three post-processing steps round out the workflow. Deduplication is almost always wanted when the same value appears many times (a log that mentions one user ID fifty times should yield one entry, not fifty), and it converts a raw match stream into a set of distinct values. Sorting makes the list scannable and makes diffs between two extractions meaningful. Finally, controlling the delimiter bridges the gap between a human-readable list (one per line) and a machine-ready string (comma-separated for a SQL IN clause, pipe-separated for a quick alternation, and so on). Together these turn the browser into a lightweight data-wrangling step: paste messy text, describe the shape of what you want, and get back a clean, deduplicated, ordered list without writing a script or leaving the page.
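As a rough scripted equivalent of this workflow, the Python sketch below extracts user IDs from a log line, deduplicates them, sorts them, and joins them for a SQL IN clause. The log format, the uid= prefix, and the users table are all assumptions made for illustration.

```python
import re

log = "uid=1007 login; uid=1003 view; uid=1007 logout; uid=1003 view"

# Extract: with one capture group, findall returns just that group.
ids = re.findall(r"uid=(\d+)", log)

# Deduplicate while keeping first-seen order.
distinct = list(dict.fromkeys(ids))

# Sort by numeric value.
ordered = sorted(distinct, key=int)

# Join with a comma to get an inline, machine-ready string.
in_clause = ", ".join(ordered)
query = f"SELECT * FROM users WHERE id IN ({in_clause})"  # hypothetical table
print(query)  # SELECT * FROM users WHERE id IN (1003, 1007)
```

Because the pattern only captures digits, the joined string is safe to inline here; values captured by a looser pattern would need quoting or parameter binding before going into a query.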
Common use cases
- Harvesting values. Pull all emails, URLs, IPs, or IDs out of a document into a list.
- Capture-group targeting. Extract just the part you want — a domain, a filename, a number — via a group.
- Building queries. Deduplicate and join matches into a comma-separated string for SQL or code.
- Cleaning data. Turn repetitive, messy text into a sorted set of distinct values.