Indent / Whitespace Detector
Detect whether a file uses tabs, 2-space or 4-space indentation.
Overview
Paste a file and the detector reports whether it's indented with tabs, two spaces, four spaces, or a mix - with a confidence percentage based on how many leading-whitespace runs followed each pattern. Useful when staring at a 500-line file in an editor that hasn't been configured to render whitespace.
It's for developers reviewing PRs from contributors using a different convention, normalising a polyglot repo, or writing the [*] block of a new .editorconfig. Reach for it before running a mass reformatter, when investigating why diffs look bigger than they should, or when chasing down a "mixed tabs and spaces" Python error.
How it works
The detector scans each non-empty line, measures its leading whitespace, and classifies the run as tabs, spaces (a multiple of a fixed indent width), or mixed. It then aggregates: the majority pattern wins, and the confidence figure is the share of indented lines that agree with it. Continuation lines, which don't start a new block, are weighted lower so they don't skew the result.
The classification rules favour the common cases - tabs vs 2/4 spaces - but flag anything unusual (3 or 8 spaces, or tab + space combinations) so you can decide whether to normalise.
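The scan-and-aggregate steps above could be sketched roughly as follows. This is a minimal illustration, not the detector's actual code: the name detect_indent, the MIXED_THRESHOLD cut-off, and the trick of inferring the space width from the most common jump in depth between consecutive lines are all assumptions made for the example, and continuation weighting is left out.

```python
from collections import Counter

MIXED_THRESHOLD = 0.8  # assumed cut-off: below this share, report "mixed"


def detect_indent(text):
    """Return (label, confidence) for the dominant indentation style."""
    tabs = mixed = 0
    space_runs = []      # leading-space counts of space-indented lines
    steps = Counter()    # positive jumps in space depth between lines
    prev_depth = None    # space depth of the previous non-empty line, if known
    for line in text.splitlines():
        if not line.strip():
            continue     # skip empty and whitespace-only lines
        lead = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in lead and " " in lead:
            mixed += 1
            prev_depth = None
        elif "\t" in lead:
            tabs += 1
            prev_depth = None
        else:
            depth = len(lead)
            if depth:
                space_runs.append(depth)
            if prev_depth is not None and depth > prev_depth:
                steps[depth - prev_depth] += 1
            prev_depth = depth
    total = tabs + mixed + len(space_runs)
    if total == 0:
        return ("none", 0.0)
    # Assumed heuristic: the indent width is the most common increase in depth.
    if steps:
        width = steps.most_common(1)[0][0]
    else:
        width = min(space_runs) if space_runs else 0
    fitting = sum(1 for d in space_runs if width and d % width == 0)
    if tabs >= fitting:
        label, votes = "tabs", tabs
    else:
        label, votes = f"{width}-space", fitting
    confidence = votes / total
    if confidence < MIXED_THRESHOLD:
        label = "mixed"
    return (label, confidence)
```

Calling detect_indent(open("some_file.py").read()) returns a label such as "2-space", "tabs" or "mixed" plus a confidence between 0 and 1. An unusual width such as 3 comes back as "3-space", which a caller could then flag.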
Examples
- Strong 2-space: 47 of 50 lines start with 2*N spaces -> 94% confidence, 2-space
- Tabs: All leading whitespace is \t characters -> tabs
- Mixed (Python warning case): Some lines use tabs, others 4 spaces -> mixed; check IndentationError
- Off-scale: Lines indented in multiples of 3 -> 3-space (unusual, flagged)
FAQ
What does "mixed" mean in practice?
Different lines use different indentation - tabs in some, spaces in others, or different space counts. Python 3 raises TabError when tabs and spaces are combined in a way that makes the block structure depend on tab width; other languages tolerate mixing, but it makes diffs noisy.
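To see the Python behaviour for yourself, here is a small sketch that assumes a CPython 3 interpreter. The source string is made up for the demonstration: one line in the block is indented with a tab and the next with eight spaces, so the two only line up if a tab is worth eight columns.

```python
# One tab on the first body line, eight spaces on the second.
source = "if True:\n\tx = 1\n        y = 2\n"

try:
    compile(source, "<example>", "exec")
    outcome = "compiled"
except TabError as exc:
    # TabError is a subclass of IndentationError.
    outcome = f"TabError: {exc.msg}"

print(outcome)
```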
How many lines does it need to be reliable?
A dozen non-empty indented lines is usually enough for high confidence. Tiny files may report low confidence because the sample size is small.
Does it handle continuation indents?
Lines that look like continuations (inside a parenthesised expression or after a \ line continuation) are weighted lower. The primary metric is "first line of an indented block."
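One way such weighting might be done is sketched below. The function name line_weights and the 0.25 weight are assumptions (the page only says "weighted lower"), and the bracket counting is deliberately naive: it doesn't skip brackets inside strings or comments.

```python
CONTINUATION_WEIGHT = 0.25  # assumed value; any weight below 1.0 fits the idea


def line_weights(text):
    """Yield (line, weight) for each non-empty line of text."""
    open_brackets = 0        # unclosed ( [ { seen so far
    after_backslash = False  # previous line ended with a \ continuation
    for line in text.splitlines():
        if not line.strip():
            continue
        is_continuation = open_brackets > 0 or after_backslash
        yield line, (CONTINUATION_WEIGHT if is_continuation else 1.0)
        # Naive bracket tracking: counts characters, ignores strings/comments.
        open_brackets += sum(line.count(c) for c in "([{")
        open_brackets -= sum(line.count(c) for c in ")]}")
        open_brackets = max(open_brackets, 0)
        after_backslash = line.rstrip().endswith("\\")
```

A detector could then add each line's weight to its pattern's tally instead of counting every line as 1.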
Can I run it on a whole repo at once?
Paste one file at a time. For repo-wide analysis, use a script that applies the detector logic to each file - or run a formatter such as Prettier or dprint, which normalises indentation automatically.
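A per-file script along those lines might look like the sketch below. Everything here is hypothetical: simple_style is a crude stand-in for the real detector logic (it only distinguishes tabs, spaces and mixed), and the scan_repo name and default suffix list are picked for illustration.

```python
from pathlib import Path


def simple_style(text):
    """Stand-in detector: returns 'tabs', 'spaces', 'mixed' or 'none'."""
    seen = set()
    for line in text.splitlines():
        lead = line[: len(line) - len(line.lstrip(" \t"))]
        if not line.strip() or not lead:
            continue  # skip blank and unindented lines
        if "\t" in lead and " " in lead:
            seen.add("mixed")
        else:
            seen.add("tabs" if "\t" in lead else "spaces")
    if not seen:
        return "none"
    return seen.pop() if len(seen) == 1 else "mixed"


def scan_repo(root, suffixes=(".py", ".js", ".ts")):
    """Map each matching file under root to its detected style."""
    results = {}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        if ".git" in path.parts:
            continue  # don't descend into VCS metadata
        text = path.read_text(encoding="utf-8", errors="replace")
        results[str(path.relative_to(root))] = simple_style(text)
    return results
```

Printing the result of scan_repo(".") gives a quick per-file summary to look over before deciding whether a mass reformat is worth it.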