CSV Deduplicator
Drop a CSV in, pick the columns that define a duplicate, get a deduplicated output. Choose case-insensitive matching, whitespace trimming, or both for forgiving comparison. Useful for cleaning email lists, removing repeated rows from log exports, or normalising address data before import.
How to use the CSV Deduplicator
Paste a CSV (the first row is treated as the header). Check the columns that should define uniqueness; the remaining columns travel with the row but don’t affect duplicate detection. Toggle case-insensitive and whitespace-trim for more forgiving matching (e.g., for email deduplication, an address with a trailing space or different capitalisation should match its trimmed, lowercase form). Pick whether to keep the first or last occurrence; last is useful when later rows have more recent data.
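The options above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the function name `dedupe_rows` and its parameters are hypothetical, and it assumes that trimming and case folding apply only to the comparison key, so the kept row itself is left untouched.

```python
from typing import Iterable, Sequence


def dedupe_rows(
    rows: Iterable[Sequence[str]],
    key_indexes: Sequence[int],
    ignore_case: bool = False,
    trim: bool = False,
    keep: str = "first",
) -> list[list[str]]:
    """Keep one row per key, where the key is built from the chosen columns."""

    def normalise(value: str) -> str:
        # Assumption: normalisation affects the comparison key only,
        # never the data stored in the surviving row.
        if trim:
            value = value.strip()
        if ignore_case:
            value = value.casefold()
        return value

    # Map each key to (input position, row) of the occurrence we keep.
    chosen: dict[tuple[str, ...], tuple[int, list[str]]] = {}
    for position, row in enumerate(rows):
        key = tuple(normalise(row[i]) for i in key_indexes)
        if key not in chosen or keep == "last":
            chosen[key] = (position, list(row))

    # Emit survivors in their original input order.
    return [row for _, row in sorted(chosen.values(), key=lambda item: item[0])]


rows = [
    ["Ann@Example.com ", "2023"],
    ["bob@example.com", "2023"],
    ["ann@example.com", "2024"],
]
first = dedupe_rows(rows, [0], ignore_case=True, trim=True)
last = dedupe_rows(rows, [0], ignore_case=True, trim=True, keep="last")
```

With `keep="first"` the 2023 row for Ann survives; with `keep="last"` the 2024 row does. The sample addresses are made up for illustration.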
About CSV Deduplicator
Deduplicating a CSV in Excel is doable but error-prone: the Remove Duplicates dialog wants the right columns selected and gives you no control over case sensitivity or whitespace trimming, and Excel can silently corrupt long numeric IDs by converting them to scientific notation. Command-line sort -u works for whole-line dedup, but it reorders the rows and isn’t aware of CSV quoting, so keying on specific columns is fragile. This tool gives you precise control: pick the dedup key, control trim and case, and choose first-vs-last semantics.
The output preserves every column and the original row order (minus the dropped duplicates). Header row passes through unchanged. RFC 4180 quoting is respected on both ends — fields with commas and newlines survive. A status line tells you how many input rows there were and how many remain after dedup so you can sanity-check the result.
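As a rough sketch of the round trip described above, the following uses Python's standard `csv` module, which handles quoted commas and embedded newlines. The function name `dedupe_csv_text` and the wording of the status line are assumptions made for this example, and it keeps the first occurrence with exact matching for brevity.

```python
import csv
import io


def dedupe_csv_text(text: str, key_columns: list[str]) -> tuple[str, str]:
    """Return (deduplicated CSV text, status line) for a CSV with a header row."""
    # newline="" leaves line endings alone so quoted newlines survive.
    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader)
    key_indexes = [header.index(name) for name in key_columns]

    seen: set[tuple[str, ...]] = set()
    kept: list[list[str]] = []
    total = 0
    for row in reader:
        if not row:  # csv.reader yields [] for blank lines; skip them
            continue
        total += 1
        key = tuple(row[i] for i in key_indexes)
        if key in seen:
            continue
        seen.add(key)
        kept.append(row)  # fields stay strings, so long IDs are not reformatted

    out = io.StringIO(newline="")
    writer = csv.writer(out)  # re-quotes fields containing commas or newlines
    writer.writerow(header)   # header passes through unchanged
    writer.writerows(kept)
    return out.getvalue(), f"{total} rows in, {len(kept)} rows out"


sample = (
    "id,name,note\r\n"
    '12345678901234567890,"Smith, Ann","line one\nline two"\r\n'
    '12345678901234567890,"Smith, A.",dup\r\n'
    "2,Bob,ok\r\n"
)
output, status = dedupe_csv_text(sample, ["id"])
```

Because every field is read and written as text, the 20-digit ID comes out exactly as it went in.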
Common use cases
- Email list cleaning — dedup by email (case-insensitive) before sending a campaign.
- Log file dedup — collapse repeated rows from a multi-source log export.
- Address normalisation — dedup by (street, city, postcode) with trim+case-insensitive matching.
- Survey response dedup — keep the last response per respondent ID when participants resubmit.
Frequently asked questions
Does it modify the kept row in any way?
What if I don't pick any columns?
Whole rows are compared, so the result is like sort -u minus the sort.
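One way to read "sort -u minus the sort" is whole-row deduplication that leaves the input order alone. The sketch below assumes that reading; the helper name `dedupe_whole_rows` is hypothetical and this is not necessarily how the tool behaves internally.

```python
def dedupe_whole_rows(rows: list[list[str]]) -> list[list[str]]:
    """Drop exact repeat rows while keeping first occurrences in input order."""
    # dict keys keep insertion order (Python 3.7+), so no sorting happens.
    return [list(key) for key in dict.fromkeys(tuple(row) for row in rows)]


rows = [["b", "1"], ["a", "2"], ["b", "1"]]
unique = dedupe_whole_rows(rows)
```

Here `unique` keeps the "b" row ahead of the "a" row, whereas sort -u would put "a" first.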