CSV Diff: Compare Two CSV Files Online

Paste two CSV files, or load them from disk, and see the changed rows, prices, and fields side by side. Everything runs in your browser: no signup, and nothing is sent to a server.

What is the CSV diff tool?

A free, in-browser tool for comparing two CSV files. Paste yesterday's vendor export on the left, today's on the right, and the changed cells light up. Nothing leaves your machine. There is no account, no server upload, no quota.

The diff itself is character-level, with a semantic post-processing pass that pulls highlights onto whole field values rather than scattering them across commas. The grammar this tool assumes is the de facto CSV spec, RFC 4180: comma-separated fields, optional double-quoting, doubled quotes to escape a literal quote, CRLF line endings.

If you have ever opened two 8,000-row vendor price lists in Excel, scrolled column by column trying to spot the SKU whose price moved, and given up, this is the tool that gets you to the change in seconds. For free-form prose use our text diff tool. For structured data with stable keys, JSON diff handles object reordering far better than CSV ever can. XML diff is the right pick for legacy data exports out of older ERPs.

How the diff actually works

The comparison is line-by-line, character-aware. Insertions appear in green on the right, deletions in red on the left. The line-mode pass keeps performance reasonable on 5 MB files, then a character-mode refinement on changed lines surfaces the exact field that moved. For most spreadsheet exports, this is faster and cleaner than running diff in a terminal.
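A minimal sketch of the two-pass idea, assuming Python's standard difflib as a stand-in for the tool's actual engine, which is not documented here. The function names two_pass_diff and changed_spans are hypothetical. The raw character spans can land awkwardly inside a value, which is the kind of output a semantic post-processing pass would tidy up.

```python
import difflib

def changed_spans(old_line: str, new_line: str) -> list[tuple[str, str]]:
    """Return (old_text, new_text) pairs for the character runs that differ."""
    matcher = difflib.SequenceMatcher(None, old_line, new_line, autojunk=False)
    return [
        (old_line[i1:i2], new_line[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

def two_pass_diff(old_text: str, new_text: str) -> list[tuple[str, str, list[tuple[str, str]]]]:
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    # Pass 1: a cheap line-level comparison finds the lines worth a closer look.
    line_matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    results = []
    for tag, i1, i2, j1, j2 in line_matcher.get_opcodes():
        if tag != "replace":
            continue  # pure inserts and deletes need no character refinement
        # Pass 2: character-level refinement on paired changed lines only.
        # zip() pairs lines positionally and ignores any unpaired remainder.
        for old_line, new_line in zip(old_lines[i1:i2], new_lines[j1:j2]):
            results.append((old_line, new_line, changed_spans(old_line, new_line)))
    return results

old = "sku,name,price\nA1,Widget,4.99\nB2,Gadget,9.50\n"
new = "sku,name,price\nA1,Widget,5.49\nB2,Gadget,9.50\n"
for old_line, new_line, spans in two_pass_diff(old, new):
    print(old_line, "->", new_line, spans)
```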

To be clear about what this is not: it is a text diff, not a true row-key-aware compare. If you reorder rows by sorting on a different column, every line will look changed even though the underlying data is identical. If you reorder columns, the same thing happens. There is no notion of a primary key here, the way csvkit's csvjoin or a pandas merge on a SKU would give you. Sort both files the same way before pasting, or use one of those tools when you genuinely need set-based comparison.
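For contrast, here is a rough sketch of what a key-aware comparison looks like, using only Python's csv module. It is not what this tool does. The function names and the sku key column are assumptions for illustration.

```python
import csv
import io

def load_by_key(csv_text: str, key: str) -> dict[str, dict[str, str]]:
    """Index rows by the value in the key column (assumed unique)."""
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    return {row[key]: row for row in reader}

def keyed_diff(old_text: str, new_text: str, key: str):
    old_rows = load_by_key(old_text, key)
    new_rows = load_by_key(new_text, key)
    added = sorted(new_rows.keys() - old_rows.keys())
    removed = sorted(old_rows.keys() - new_rows.keys())
    changed = {}
    for k in old_rows.keys() & new_rows.keys():
        a, b = old_rows[k], new_rows[k]
        # Compare by column name, so row order and column order do not matter.
        diffs = {c: (a.get(c), b.get(c)) for c in a.keys() | b.keys() if a.get(c) != b.get(c)}
        if diffs:
            changed[k] = diffs
    return added, removed, changed

old = "sku,name,price\nA1,Widget,4.99\nB2,Gadget,9.50\n"
new = "sku,name,price\nB2,Gadget,9.50\nA1,Widget,5.49\nC3,Gizmo,2.00\n"
added, removed, changed = keyed_diff(old, new, "sku")
print(added)    # ['C3']
print(removed)  # []
print(changed)  # {'A1': {'price': ('4.99', '5.49')}}
```

The rows in the second file are in a different order, yet only the real price change and the new SKU are reported. A line-based diff of the same two inputs would flag every body row.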

The header row is data to this tool. Line 1 is treated like every other line, which means a header rename shows as a single-line diff at the top. That is usually what you want when reviewing schema drift. If you only care about body rows, strip the header from both sides before pasting. The CSV format itself, including header conventions, is described in the CSV background article and in RFC 7111, which defines URI fragments for selecting cell ranges within a CSV resource.

How to compare CSV files in three steps

Two text panes, one diff. There is no signup and no server round-trip; files you load stay in the browser.

  1. Paste or upload your CSV

    Paste the older CSV on the left and the newer one on the right. Or click Upload on either side to load a .csv, .tsv, or .txt file straight from disk. The Sample button fills both panes with a small product price list so you can see the tool in action first.

  2. Normalise line endings if needed

    Excel saves CSV with CRLF (\r\n) on Windows, and Python's csv.writer also defaults to CRLF. Tools like pandas' to_csv on Linux or macOS, or a Unix shell pipeline, usually write LF (\n). If one side is CRLF and the other LF, every line will look changed. Open both in the same editor (or pass the CRLF file through dos2unix) before pasting so endings match. A UTF-8 BOM at the start of an Excel-saved CSV is the other classic culprit; strip it if your other file does not have one.

  3. Read the diff

    Deletions show with a red highlight on the left, insertions with a green highlight on the right. Scroll either side and the other follows. Look for changed prices, new rows added at the bottom, and email addresses corrected mid-list. Header changes appear as a single-line diff on line 1.

When CSV diff is the right tool

Reconciling a vendor inventory CSV against your last export

A supplier sends a fresh price list weekly. Drop last week's file in the left pane, this week's on the right, and SKU-level price moves jump out without you scrolling 4,000 rows in Excel hunting for which cell turned yellow. Doing this in Excel with VLOOKUP against a second sheet is doable but slow; the diff is faster when you just need a quick sanity check before pushing the new prices into your store.

Comparing a database export against a sanitised partner copy

Before sending a customer dump to a partner you redact PII: email becomes redacted@example.com, phone gets masked, full names truncated to first initial. Diff the source export against the sanitised copy and confirm that only the redacted columns changed. If a billing_address column unexpectedly shows up in the diff, your redaction script missed a field, and you catch it before the file leaves your laptop.
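The same check can be scripted as a guard before the file goes out. This is a sketch under stated assumptions: both files keep the same row order, and the column names (email, phone, full_name, billing_address) and the function changed_columns are hypothetical.

```python
import csv
import io

def changed_columns(source_text: str, sanitised_text: str) -> set[str]:
    """Return the names of columns whose value differs in at least one row."""
    src = list(csv.DictReader(io.StringIO(source_text, newline="")))
    san = list(csv.DictReader(io.StringIO(sanitised_text, newline="")))
    if len(src) != len(san):
        raise ValueError("row counts differ; redaction should not add or drop rows")
    changed = set()
    for a, b in zip(src, san):  # rows are paired by position, not by key
        for col in a.keys() | b.keys():
            if a.get(col) != b.get(col):
                changed.add(col)
    return changed

source = "id,email,phone,billing_address\n1,ann@example.org,555-0100,1 Main St\n"
sanitised = "id,email,phone,billing_address\n1,redacted@example.com,***,1 Main Street\n"

ALLOWED = {"email", "phone", "full_name"}  # columns the redaction script may touch
unexpected = changed_columns(source, sanitised) - ALLOWED
print(unexpected)  # {'billing_address'}
```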

Auditing a product feed before pushing to a merchant catalog

Google Merchant Center, Facebook Catalog, and most marketplaces eat a CSV (or TSV) feed. After your CMS regenerates the feed, diff it against last night's known-good copy to catch a regex that stripped trailing zeros from price, an availability column that flipped from in_stock to out_of_stock on every row because of an IF() bug, or a new GTIN field your team added without telling anyone.

Diffing two Salesforce or HubSpot CSV exports

When a sales op asks "which contacts changed since last week?", export the same report twice (then and now), and diff the two files. The CSV will not be ordered by record ID by default, so sort both sides on the ID column first; otherwise every row looks reshuffled. Once sorted, real changes (a lead status moved from MQL to SQL, an email corrected after a bounce) show up clearly.
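If the export tool cannot sort for you, a small pre-pass does it. The sketch below assumes an Id column (a hypothetical name; use whatever your report calls it) and sorts values as strings, which is fine for fixed-width record IDs but not for unpadded numbers.

```python
import csv
import io

def sort_csv(csv_text: str, key_column: str) -> str:
    """Sort body rows by one column, keeping the header row on top."""
    rows = list(csv.reader(io.StringIO(csv_text, newline="")))
    header, body = rows[0], rows[1:]
    idx = header.index(key_column)
    body.sort(key=lambda row: row[idx])  # string comparison, not numeric
    out = io.StringIO(newline="")
    csv.writer(out, lineterminator="\n").writerows([header, *body])
    return out.getvalue()

export = "Id,Email,Status\n003,c@example.com,MQL\n001,a@example.com,SQL\n"
print(sort_csv(export, "Id"))
```

Run both exports through the same function before pasting so the two panes line up row for row.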

Comparing pipeline output against yesterday

A nightly ETL job writes a CSV to S3. Today's row count is suspiciously different from yesterday. Pull both files, diff them, and you find the schema drift: a new column appeared mid-table because the upstream API added a field, which shifted every downstream cell by one. The diff catches the structural change instantly, where a row count check on its own would only have told you something was off.
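A header comparison is a cheap automated companion to the visual diff. This is a minimal sketch; the column names and the function header_drift are made up for the example.

```python
import csv
import io

def header_drift(old_text: str, new_text: str) -> tuple[list[str], list[str]]:
    """Return (added, removed) column names, based on the first row of each file."""
    old_header = next(csv.reader(io.StringIO(old_text, newline="")))
    new_header = next(csv.reader(io.StringIO(new_text, newline="")))
    added = [c for c in new_header if c not in old_header]
    removed = [c for c in old_header if c not in new_header]
    return added, removed

yesterday = "id,amount,currency\n1,10.00,USD\n"
today = "id,amount,region,currency\n1,10.00,EU,USD\n"
print(header_drift(yesterday, today))  # (['region'], [])
```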

Reviewing an A/B test result CSV from your stats team

Your data scientist ships a CSV with the experiment results. They re-run the analysis after a methodology tweak. Diff v1 against v2: did only the test statistics columns change, or did the assignment counts shift too (which would mean the cohort definition moved)? Honest diffing of analysis outputs is the cheapest sanity check before you ship a feature based on the result.

CSV quick reference

A short cheat sheet for the parsing edge cases this tool surfaces most often. The grammar reference is RFC 4180 plus what real spreadsheet tools actually emit.

| Topic | What this tool does |
| --- | --- |
| Quoting per RFC 4180 | Fields containing commas, double quotes, or newlines must be wrapped in double quotes. "Widget, small" is one field. Plain values like 4.99 need no quotes. See RFC 4180. |
| Escaped double quotes | A literal " inside a quoted field is escaped by doubling it. "He said ""hi""" is the value He said "hi". Backslash escaping is not standard CSV, even though some tools accept it. |
| Embedded newlines | Newlines are allowed inside a quoted field per RFC 4180. A multi-line address inside one cell is valid CSV. The diff treats each physical line as a unit, so a newline-bearing field can show changes on the wrapping line. |
| Line endings (CRLF vs LF) | RFC 4180 says CRLF (\r\n). Excel emits CRLF. Most Unix tools emit LF. Mixing the two between left and right panes makes every line look changed. Normalise endings before diffing. |
| BOM (UTF-8) | Excel-saved CSVs often start with a UTF-8 BOM (EF BB BF). The Unicode BOM FAQ marks it as optional and unnecessary for UTF-8. A mismatched BOM between the two files produces a phantom diff on line 1. |
| Delimiter dialects | Comma is the default; European Excel exports semicolon (;) because the comma is the decimal separator in many locales. TSV (tab-separated) is a common variant; see the TSV background. Both files must use the same delimiter for a meaningful diff. |
| Header rows | This tool treats line 1 as data. A column rename appears as a single-line diff at the top. If you only want to compare body rows, strip the header from both panes before pasting. Most parsers (Python's csv.DictReader, pandas) treat line 1 as a header by convention but the file format does not require it. |
| Encoding | UTF-8 only via the in-browser FileReader. Latin-1 (ISO-8859-1) and Windows-1252 will load but render garbled accented characters. Convert with iconv or VS Code's "Save with Encoding" before pasting. |
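The quoting rules are easy to confirm with Python's csv module. The sample data below is invented for illustration. Note how three CSV records occupy four physical lines, which is why a line-based diff can split a multi-line field across rows.

```python
import csv
import io

raw = (
    'sku,name,note\r\n'
    'A1,"Widget, small","He said ""hi"""\r\n'
    'B2,Gadget,"line one\nline two"\r\n'
)

# newline="" stops Python from translating line endings before csv sees them.
rows = list(csv.reader(io.StringIO(raw, newline="")))

print(rows[1][1])             # Widget, small
print(rows[1][2])             # He said "hi"
print(len(rows))              # 3
print(len(raw.splitlines()))  # 4
```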

CSV diff: frequently asked questions

Will reordering CSV rows or columns produce a noisy diff?

Yes. The engine compares characters line by line, so reordering rows or columns will show as differences even when the underlying data is identical. It is a text diff with a few CSV-aware reading aids. There is no row-key matching, no column reordering awareness, and no semantic understanding of types. For that level of comparison, use a tool like csvkit, pandas with a merge on a primary key, or load both files into SQLite and run an EXCEPT query. For day-to-day spreadsheet comparison, this tool covers most of what you need.
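The SQLite route can be sketched with Python's built-in sqlite3 module. The table names old_rows and new_rows and the helper load_table are hypothetical. Header names are interpolated into SQL unescaped, so treat this as suitable for trusted local files only.

```python
import csv
import io
import sqlite3

def load_table(conn: sqlite3.Connection, table: str, csv_text: str) -> None:
    """Create a table from a CSV string, using the first row as column names."""
    rows = list(csv.reader(io.StringIO(csv_text, newline="")))
    header, body = rows[0], rows[1:]
    cols = ", ".join(f'"{c}"' for c in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', body)

old = "sku,name,price\nA1,Widget,4.99\nB2,Gadget,9.50\n"
new = "sku,name,price\nB2,Gadget,9.50\nA1,Widget,5.49\n"

conn = sqlite3.connect(":memory:")
load_table(conn, "old_rows", old)
load_table(conn, "new_rows", new)

# EXCEPT is a set operation, so row order in the source files is irrelevant.
only_in_new = conn.execute("SELECT * FROM new_rows EXCEPT SELECT * FROM old_rows").fetchall()
only_in_old = conn.execute("SELECT * FROM old_rows EXCEPT SELECT * FROM new_rows").fetchall()
print(only_in_new)  # [('A1', 'Widget', '5.49')]
print(only_in_old)  # [('A1', 'Widget', '4.99')]
```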

How do I handle CRLF line endings from Windows vs LF on Unix?

Excel writes CSV with CRLF (\r\n), and Python's csv.writer also defaults to CRLF unless you set lineterminator. pandas' to_csv on Linux or macOS and most Unix tools write LF (\n), and a shell pipeline can produce either depending on flags. If one pane is CRLF and the other LF, the diff sees a difference at every single line. The fix is to normalise both sides before pasting: open them in the same editor and save with one ending, or pipe one through dos2unix or unix2dos. RFC 4180 mandates CRLF, but real-world CSV files use either.
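If dos2unix is not to hand, a few lines of Python do the same job. This is a blunt sketch with hypothetical function names: it also rewrites newlines inside quoted fields, which is normally what you want for diffing but does change the data.

```python
def line_ending_style(text: str) -> str:
    """Classify a string as 'CRLF', 'LF', 'mixed', or 'none'."""
    crlf = text.count("\r\n")
    lf = text.count("\n") - crlf  # bare LFs not preceded by CR
    if crlf and lf:
        return "mixed"
    if crlf:
        return "CRLF"
    if lf:
        return "LF"
    return "none"

def normalise_newlines(text: str) -> str:
    """Convert CRLF and stray CR to LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")

windows_export = "sku,price\r\nA1,4.99\r\n"
print(line_ending_style(windows_export))                      # CRLF
print(line_ending_style(normalise_newlines(windows_export)))  # LF
```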

Does the tool handle column reordering?

No. This is a line-by-line text diff. If row 1 has columns in the order sku,name,price on the left and name,sku,price on the right, every line will look changed because the diff sees raw text, not a column model. The same applies if you sort the two files on different columns. Reorder columns and rows so they match before diffing, or reach for a row-key-aware tool like csvkit's csvjoin or a pandas merge when the column model genuinely matters.
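Putting one file's columns into the other's order is a short pre-pass. A minimal sketch, assuming both files share the same column names; reorder_columns is a hypothetical helper.

```python
import csv
import io

def reorder_columns(csv_text: str, column_order: list[str]) -> str:
    """Rewrite a CSV so its columns appear in the given order."""
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.DictWriter(out, fieldnames=column_order, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow({col: row[col] for col in column_order})
    return out.getvalue()

right = "name,sku,price\nWidget,A1,4.99\n"
print(reorder_columns(right, ["sku", "name", "price"]))
```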

How does the tool handle the UTF-8 BOM that Excel adds?

Excel on Windows often writes a UTF-8 byte order mark at the start of a CSV (the bytes EF BB BF), even though the Unicode BOM FAQ says it is unnecessary for UTF-8. If one of your files has a BOM and the other does not, you will see a phantom one-character diff at line 1, column 1. Open the file in a real text editor (VS Code, Notepad++, Sublime) and re-save without a BOM, or strip the first three bytes manually before pasting.
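Stripping the BOM by script is equally short. The sketch below works on raw bytes; strip_utf8_bom is a hypothetical helper name, while codecs.BOM_UTF8 and the utf-8-sig codec are part of Python's standard library.

```python
import codecs

def strip_utf8_bom(data: bytes) -> bytes:
    """Drop a leading UTF-8 byte order mark (EF BB BF) if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

excel_bytes = b"\xef\xbb\xbfsku,price\n"
print(strip_utf8_bom(excel_bytes))        # b'sku,price\n'

# Alternatively, decode with utf-8-sig, which discards the BOM while reading.
print(excel_bytes.decode("utf-8-sig"))
```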

What encodings does the tool accept?

UTF-8 is the only encoding the FileReader path decodes correctly here. A Latin-1 or Windows-1252 file will load but accented characters will look garbled (e.g. café becomes café). UTF-16 with a BOM will look completely broken. Convert non-UTF-8 files first: in the terminal, iconv -f WINDOWS-1252 -t UTF-8 input.csv > out.csv; in VS Code, use "Reopen with Encoding" then "Save with Encoding" to UTF-8. Python's csv module and pandas both let you specify the source encoding when reading.
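The conversion is a one-liner in Python as well. This sketch assumes you already know the source encoding (cp1252 here, Python's name for Windows-1252); to_utf8 is a hypothetical helper.

```python
def to_utf8(data: bytes, source_encoding: str = "cp1252") -> bytes:
    """Re-encode bytes from a known legacy encoding to UTF-8."""
    return data.decode(source_encoding).encode("utf-8")

legacy = "café".encode("cp1252")  # b'caf\xe9'

# Decoding the legacy bytes as UTF-8 fails on 0xE9; with errors="replace"
# the bad byte becomes U+FFFD, the replacement character.
print(legacy.decode("utf-8", errors="replace"))

# After conversion the bytes decode cleanly.
print(to_utf8(legacy).decode("utf-8"))  # café
```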

How big a CSV file can I diff?

Up to a few MB is fine and feels instant. Past 10 MB the browser starts to feel it, and most of the cost is rendering the highlighted diff (not computing it). For very large exports (50 MB+), filter both files down to the columns or rows you actually care about first. csvkit's csvcut and csvgrep, jq for JSON-converted CSV, or a quick awk/grep are all fine pre-passes. Then paste the trimmed slice here.

Privacy and how this works

Your CSV never leaves your browser. The reader, the diff, and the rendering all run on your machine, locally. No analytics on your input, no logs, no "helpful" cloud round-trip. To check, open DevTools, switch to the Network tab, and watch. There are no outbound requests when you compare. The format reference is RFC 4180, and Python's csv module documentation is the reference we cross-check parsing edge cases against.