CSV Deduplicator

Remove duplicate CSV rows while preserving order.

Overview

The CSV deduplicator removes duplicate rows from a pasted CSV while keeping the original row order intact. You choose whether dedup runs across every column or just a subset (the join key or natural identifier), and the first occurrence of each unique row wins.

It's a quick fix when a join, export, or copy-paste has produced repeated entries that need to be flattened before import. Analysts, marketers cleaning contact lists, and developers preparing seed data use it to avoid spinning up Pandas or SQL just to filter duplicates.

How it works

Rows are parsed per RFC 4180, including quoted fields with embedded commas and escaped quotes. A signature is computed from the selected columns (the full row by default), with the field values joined by a separator chosen to avoid accidental collisions. A hash set tracks the signatures seen so far, and any row whose signature is already in the set is dropped.
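To illustrate why the separator matters, here is a minimal Python sketch. The function names and the choice of the ASCII unit separator are assumptions made for illustration, not the tool's actual implementation. Joining fields with a plain comma lets two different rows produce the same signature, while a separator that does not occur in the data keeps them apart.

```python
# Hypothetical separator: the ASCII unit separator, assumed absent from the data.
UNIT_SEP = "\x1f"

def naive_signature(fields):
    # Joining on a comma is ambiguous when a field itself contains a comma.
    return ",".join(fields)

def safer_signature(fields):
    # Joining on a character that is assumed not to appear in any field
    # keeps the field boundaries recoverable.
    return UNIT_SEP.join(fields)

row_a = ["a,b", "c"]   # two fields, the first contains a comma
row_b = ["a", "b,c"]   # two fields, the second contains a comma

naive_collides = naive_signature(row_a) == naive_signature(row_b)
safer_collides = safer_signature(row_a) == safer_signature(row_b)
```

In Python specifically, using a tuple of field values as the set key sidesteps the separator question altogether, since tuples compare field by field.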

Order preservation is explicit: output rows appear in the same sequence as the input, and when a row repeats, the first instance is kept and later copies are dropped. Trimming and case-folding on the dedup key are opt-in, so rows that differ only in whitespace or letter case are not silently collapsed.
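The steps described in this section can be sketched in a few lines of Python. This is an approximation under stated assumptions, not the tool's own code: `dedupe_csv` and `key_columns` are hypothetical names, the first row is assumed to be a header, and Python's `csv` module stands in for the RFC 4180 parser.

```python
import csv
import io

def dedupe_csv(text, key_columns=None):
    """Drop repeated rows, keeping the first occurrence and the input order.

    key_columns is a list of header names to compare on; None means the
    whole row. Assumes the first row is a header.
    """
    rows = [r for r in csv.reader(io.StringIO(text, newline="")) if r]
    if not rows:
        return ""
    header, data = rows[0], rows[1:]
    indexes = None
    if key_columns is not None:
        indexes = [header.index(name) for name in key_columns]

    seen = set()
    kept = [header]
    for row in data:
        # A tuple of field values serves as the signature here.
        if indexes is None:
            signature = tuple(row)
        else:
            signature = tuple(row[i] for i in indexes)
        if signature in seen:
            continue  # a later duplicate: drop it
        seen.add(signature)
        kept.append(row)

    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(kept)
    return out.getvalue()
```

Calling `dedupe_csv(text)` compares whole rows, while `dedupe_csv(text, key_columns=["email"])` compares only the `email` column. Because rows are appended in the order they are read, the output order follows the input without any extra bookkeeping.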

Examples

Dedup key: all columns (default)
Input:
email,name
a@x.com,Alice
b@x.com,Bob
a@x.com,Alice
c@x.com,Carol
b@x.com,Bob

Output:
email,name
a@x.com,Alice
b@x.com,Bob
c@x.com,Carol

Dedup key: email only
Input:
email,name
a@x.com,Alice
a@x.com,Alicia

Output:
email,name
a@x.com,Alice

FAQ

Does dedup ignore case and whitespace?

By default, no: "alice" and "Alice" are treated as different values. Enable the case-insensitive and trim options if you want the dedup columns compared without regard to letter case or surrounding whitespace.
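As a rough illustration of how opt-in normalization might be applied to the dedup key, consider the Python sketch below. The function `normalize_key` and its `trim` and `ignore_case` parameters are hypothetical names, not the tool's actual options. Only the key used for comparison is normalized, so the row itself can still be emitted unchanged.

```python
def normalize_key(values, trim=False, ignore_case=False):
    """Build a comparison key from the dedup column values.

    Both options default to off, so values are compared exactly
    unless the caller opts in.
    """
    normalized = []
    for value in values:
        if trim:
            value = value.strip()      # remove leading/trailing whitespace
        if ignore_case:
            value = value.casefold()   # case-insensitive comparison form
        normalized.append(value)
    return tuple(normalized)
```

With the defaults, `normalize_key(["Alice "])` and `normalize_key(["alice"])` differ; with both options enabled they produce the same key.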

Can I keep the last duplicate instead of the first?

The current behaviour keeps the first occurrence, which preserves the original order. To keep the last, reverse the data rows (leaving the header in place), dedup, then reverse again. Alternatively, sort the file so that the freshest record for each key comes first, then dedup.
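The reverse, dedup, reverse workaround can be sketched in Python as follows. The helper `keep_last` is a hypothetical name used for illustration, and it operates on data rows only, with the header assumed to be set aside beforehand.

```python
def keep_last(rows, key):
    """Keep the last occurrence of each key among the data rows.

    rows: list of data rows (header excluded).
    key:  function mapping a row to its dedup key.
    """
    seen = set()
    kept_reversed = []
    # Walk from the end so the last occurrence is seen first.
    for row in reversed(rows):
        k = key(row)
        if k not in seen:
            seen.add(k)
            kept_reversed.append(row)
    # Undo the reversal so survivors are back in input order.
    kept_reversed.reverse()
    return kept_reversed

rows = [
    ["a@x.com", "Alice"],
    ["b@x.com", "Bob"],
    ["a@x.com", "Alicia"],
]
latest = keep_last(rows, key=lambda row: row[0])
```

Note that each surviving row sits at the position of its last occurrence, so here the `b@x.com` row comes before the `a@x.com` row.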

What about near-duplicates with tiny formatting differences?

This tool does exact matching after optional trim/case-fold. Fuzzy matching needs a more expensive algorithm (Levenshtein, Jaro-Winkler) and is out of scope here.
