Perceptual Image Hash

Compute aHash, dHash and pHash for an image.


Overview

The Perceptual Image Hash tool computes three small fingerprints for an uploaded image: aHash (average), dHash (difference) and pHash (DCT-based perceptual hash). Each fingerprint is a 64-bit value rendered as a 16-character hex string. Two images whose hashes differ by only a handful of bits are very likely to be visual near-duplicates.

This is the foundation of duplicate-photo detection in cloud galleries, copyright matching on user-generated content platforms, finding near-duplicate product shots in an e-commerce catalogue, and any workflow that asks "are these two pictures basically the same?" The Hamming distance between two perceptual image hashes correlates strongly with visual similarity.

How it works

aHash downscales the image to 8x8, converts to grayscale, computes the mean luminance, and produces a 64-bit value where each bit indicates whether that cell sits above or below the mean. dHash uses a 9x8 downscale (nine columns, eight rows) and compares each cell with its right-hand neighbour, giving 8 comparisons per row and 64 bits in total; because only relative differences are encoded, it is robust to uniform brightness changes. pHash downscales to 32x32 grayscale, applies a two-dimensional discrete cosine transform, keeps the low-frequency 8x8 block, and threshold-encodes those coefficients against their median. It is generally the most robust of the three against resizing and re-compression.
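The three steps above can be sketched in pure Python. This is a minimal illustration, not the tool's actual implementation: it assumes the image has already been converted to grayscale and downscaled (by an image library of your choice) into a list of rows of numbers, and the function names, the bit order (row by row, most significant bit first) and the direction of the dHash comparison are all assumptions. Hashes produced this way will not necessarily match the tool's output bit for bit.

```python
import math


def _bits_to_int(bits):
    """Pack an iterable of booleans into an integer, first bit most significant."""
    value = 0
    for bit in bits:
        value = (value << 1) | int(bit)
    return value


def ahash(gray):
    """aHash of an 8x8 grayscale grid: a bit is set when the cell is above the mean."""
    cells = [v for row in gray for v in row]
    mean = sum(cells) / len(cells)
    return _bits_to_int(v > mean for v in cells)


def dhash(gray):
    """dHash of a grid with 8 rows and 9 columns.

    A bit is set when a cell is brighter than its right-hand neighbour
    (the direction is a convention; some libraries use the opposite one).
    """
    return _bits_to_int(row[x] > row[x + 1] for row in gray for x in range(8))


def phash(gray, size=32, keep=8):
    """pHash of a size x size grayscale grid using a naive orthonormal DCT-II.

    Only the keep x keep lowest-frequency coefficients are computed, then
    each one is compared against the median of that block.
    """
    basis = [
        [math.cos((2 * x + 1) * u * math.pi / (2 * size)) for x in range(size)]
        for u in range(keep)
    ]
    scale = [math.sqrt((1 if u == 0 else 2) / size) for u in range(keep)]
    # Horizontal pass: rows[y][u] is the u-th DCT coefficient of row y (unscaled).
    rows = [
        [sum(g * c for g, c in zip(row, basis[u])) for u in range(keep)]
        for row in gray
    ]
    coeffs = []
    for v in range(keep):          # vertical frequency
        for u in range(keep):      # horizontal frequency
            total = sum(rows[y][u] * basis[v][y] for y in range(size))
            coeffs.append(scale[u] * scale[v] * total)
    ordered = sorted(coeffs)
    mid = len(ordered) // 2
    median = (ordered[mid - 1] + ordered[mid]) / 2
    return _bits_to_int(c > median for c in coeffs)


def to_hex(h):
    """Render a 64-bit hash as a 16-character hex string."""
    return f"{h:016x}"


# A grid whose left half is black and right half is white.
half = [[0] * 4 + [255] * 4 for _ in range(8)]
print(to_hex(ahash(half)))  # 0f0f0f0f0f0f0f0f
```

The DCT here is written for readability rather than speed; a production implementation would normally call an optimised DCT routine from a numerical library instead.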

Comparing two fingerprints is a simple Hamming distance: XOR the two 64-bit values and count the set bits. As a rule of thumb, a distance under roughly 5 indicates likely duplicates, 5-15 indicates similar but not identical content, and above 20 the images are most likely unrelated; distances of 16-20 are borderline.
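The comparison can be sketched as follows. The function names and the labels returned by classify are hypothetical, and the thresholds simply mirror the rule of thumb above, with the 16-20 range treated as "borderline" by assumption. They are starting points, not calibrated values.

```python
def hamming(a, b):
    """Number of differing bits between two 64-bit hashes given as integers."""
    return bin(a ^ b).count("1")


def hamming_hex(a_hex, b_hex):
    """Same comparison for hashes given as 16-character hex strings."""
    return hamming(int(a_hex, 16), int(b_hex, 16))


def classify(distance):
    """Map a Hamming distance to a rough verdict (thresholds are rules of thumb)."""
    if distance < 5:
        return "likely duplicate"
    if distance <= 15:
        return "similar"
    if distance <= 20:
        return "borderline"  # assumption: the text gives no verdict for 16-20
    return "likely unrelated"


d = hamming_hex("0fe78f87c7e0e0e1", "9301a4abd16723f0")
print(d, classify(d))
```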

Examples

Image A: photo.jpg                aHash 8f3a72e1c5b40d9a
                                  dHash 30a8b5c79e2f1d04
                                  pHash 0fe78f87c7e0e0e1

Image B: photo-resized.jpg        aHash 8f3a72e1c5b40d9a (identical)
                                  dHash 30a8b5c79e2f1d04 (identical)
                                  pHash 0fe78f87c7e0e0e1 (identical)
                                  Hamming distance: 0 → near-duplicate

Image C: unrelated.jpg            pHash 9301a4abd16723f0
                                  Distance to A: 29 → different image

FAQ

Which hash should I use?

dHash and pHash are both more robust than aHash. Use dHash when speed matters (like aHash it needs no DCT, so it is much cheaper than pHash) and pHash when you need maximum robustness to resizing, compression and small crops.

What Hamming distance counts as a match?

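For pHash, a distance under 5 strongly implies a duplicate, and 5-15 suggests a near-duplicate or visually similar image. Above 20 the images are most likely unrelated; distances of 16-20 are borderline. Calibrate these thresholds against your own data set for production use.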

Can I detect cropped duplicates?

These hashes work on the whole frame and degrade under heavy cropping. For crop-tolerant matching, combine them with a local-feature approach (e.g. SIFT keypoints).

Are the hashes secure?

No — they are designed to collide on similar images and are not collision-resistant in the cryptographic sense. Don't use them for integrity verification.

How big are the hashes?

Each is 64 bits, fitting in a 64-bit integer (such as a long) or a 16-character hex string. Storing all three takes 24 bytes per image, which is easy to index in a database.
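As a rough illustration of the storage arithmetic, the sketch below packs the three hashes into a 24-byte blob with Python's struct module. The function names and the big-endian layout are assumptions made for the example. The two conversion helpers cover a common snag: a hash with its top bit set does not fit in a signed 64-bit column as an unsigned value, so it is reinterpreted as a negative number with the same bit pattern.

```python
import struct


def pack_hashes(a_hash, d_hash, p_hash):
    """Pack three unsigned 64-bit hashes into 24 bytes (big-endian)."""
    return struct.pack(">QQQ", a_hash, d_hash, p_hash)


def unpack_hashes(blob):
    """Inverse of pack_hashes: returns (a_hash, d_hash, p_hash)."""
    return struct.unpack(">QQQ", blob)


def to_signed64(h):
    """Reinterpret an unsigned 64-bit value as signed, keeping the bit pattern."""
    return h - (1 << 64) if h >= (1 << 63) else h


def from_signed64(v):
    """Recover the unsigned 64-bit value from its signed representation."""
    return v & 0xFFFFFFFFFFFFFFFF


hashes = (0x8F3A72E1C5B40D9A, 0x30A8B5C79E2F1D04, 0x0FE78F87C7E0E0E1)
blob = pack_hashes(*hashes)
print(len(blob))  # 24
```

Because the signed reinterpretation preserves every bit, Hamming distances computed on the XOR of two stored values are unaffected by the conversion.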

