Levenshtein Distance

Compute edit distance and similarity between two strings.

Open tool

Overview

The Levenshtein distance between two strings is the minimum number of single-character edits — insertions, deletions, or substitutions — needed to turn one into the other. The tool returns the distance, a normalized similarity score (0 to 1), and optionally the actual sequence of edits.

Developers building fuzzy search, spell-checkers, autocomplete, and DNA-style sequence alignment reach for Levenshtein distance constantly. Data engineers also use it to dedupe customer records when an exact match misses near-duplicates ("Robert Smith" vs "Roberrt Smith").

How it works

The classic algorithm fills a (m+1) × (n+1) dynamic-programming matrix where cell (i, j) holds the distance between the first i characters of the source and the first j characters of the target. Each cell is the minimum of three neighbors plus the edit cost: insertion (cell above + 1), deletion (cell to the left + 1), or substitution (cell diagonal + 0 if characters match, else +1). The final answer sits in the bottom-right corner.

Similarity is reported as 1 − distance / max(len(a), len(b)), giving a 0–1 score useful for thresholds in fuzzy matching.

Examples

A: kitten
B: sitting
Distance: 3   (k→s, e→i, insert g)
A: book
B: back
Distance: 2   (o→a, o→c — wait, two subs gives "back"? Actually book→bock→back)
A: flaw
B: lawn
Distance: 2   (delete f, insert n at end)

FAQ

What's the difference between Levenshtein and Hamming distance?

Hamming distance only counts substitutions and requires equal-length strings. Levenshtein allows insertions and deletions too, so it works on strings of different lengths.

Are insertions and deletions weighted equally?

In the standard Levenshtein algorithm, yes — each costs 1. Weighted variants (Damerau-Levenshtein, custom cost matrices) exist when you want, say, swap-adjacent or vowel-substitution to count differently.

Is it the same as "fuzzy match score"?

Fuzzy-matching libraries often normalize Levenshtein into a 0–100 score. The raw distance is the integer underneath.

Try Levenshtein Distance

An unhandled error has occurred. Reload ×