File Entropy

Measure the Shannon entropy of a file to detect encryption or compression.

Overview

The file entropy calculator measures the Shannon entropy of a file's byte distribution and reports a value between 0 and 8 bits per byte. Low entropy means structured, predictable content like English text or source code; high entropy points to compressed or encrypted data where every byte value is almost equally likely.

Security analysts triaging suspicious files, malware researchers spotting packed binaries, and forensic examiners differentiating archives from encrypted containers reach for this as a first-pass classifier. Long-tail searches that lead here include "calculate Shannon entropy of file", "detect encrypted file by entropy", and "is this file compressed or encrypted".

How it works

The Shannon entropy of a byte stream is H = -Σ p_i * log2(p_i), summed over the 256 possible byte values i, where p_i is the empirical probability of byte value i. The calculator counts the occurrences of each byte value, divides each count by the total length to get probabilities, and sums -p_i * log2(p_i) over every non-zero probability. Byte values that never occur are skipped, since they contribute nothing (0 * log2(0) is taken as 0).
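The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual source: the function name `shannon_entropy` and the choice to return 0.0 for empty input are assumptions. It uses the equivalent form p * log2(1/p) for each term, which is the same quantity as -p * log2(p).

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0  # assumption: treat an empty input as zero entropy
    total = len(data)
    counts = Counter(data)  # iterating over bytes yields ints in 0..255
    # Only byte values that actually occur appear in `counts`, so every
    # probability is non-zero and log2 is always defined.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


if __name__ == "__main__":
    print(shannon_entropy(b"aaaaaaaa"))        # a single repeated byte value
    print(shannon_entropy(bytes(range(256))))  # every byte value exactly once
```

A single repeated byte gives the minimum of 0 bits per byte, and an input in which all 256 values occur equally often gives the maximum of 8.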

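The result is read against a rough scale: plain ASCII text typically lands between 4.0 and 5.0; PDFs around 7.0; well-compressed archives (.zip, .gz) approach 7.9; and properly encrypted blobs sit at 7.99+ with a flat byte histogram. These figures are guides rather than hard thresholds, and they assume an input large enough for the byte counts to be meaningful: a sample of n bytes cannot exceed log2(n) bits per byte, so very small files read low whatever they contain. A sliding-window mode reports the entropy of overlapping chunks, so localised high-entropy regions (such as an encrypted resource embedded in an otherwise normal file) become visible.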
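A sliding-window pass might look like the sketch below. The names `windowed_entropy`, `window`, and `step`, and the default values, are assumptions made for illustration; the tool's own parameters may differ. Each window is scored independently, and the offset of each window is returned alongside its entropy.

```python
import math
from collections import Counter
from typing import Iterator, Tuple


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of `data` in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    return sum((n / total) * math.log2(total / n)
               for n in Counter(data).values())


def windowed_entropy(data: bytes, window: int = 256,
                     step: int = 128) -> Iterator[Tuple[int, float]]:
    """Yield (offset, entropy) for overlapping windows across `data`.

    Windows overlap whenever `step` is smaller than `window`. Input shorter
    than one window is scored as a single short window at offset 0.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    last_start = max(len(data) - window, 0)
    for offset in range(0, last_start + 1, step):
        yield offset, shannon_entropy(data[offset:offset + window])


if __name__ == "__main__":
    # Synthetic sample: a high-entropy block between two runs of zero bytes.
    sample = b"\x00" * 1024 + bytes(range(256)) * 4 + b"\x00" * 1024
    for offset, h in windowed_entropy(sample):
        print(f"{offset:6d}  {h:.2f}")
```

In the synthetic sample, the windows that fall entirely inside the middle block score far higher than those on either side, which is the pattern to look for when hunting an embedded resource.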

Examples

  • Confirm a .bin file is compressed (entropy near 7.9) rather than raw structured data.
  • Spot a high-entropy region inside an otherwise low-entropy executable, a classic packer signature.
  • Distinguish an encrypted payload from a compressed one based on whether the byte histogram is statistically flat or only close to flat with some residual bias.
  • Flag a file that claims to be compressed video but has suspiciously low entropy, for example raw, uncompressed CCTV footage.

FAQ

Can entropy alone prove a file is encrypted?
No. Encryption and good compression both produce high entropy. The histogram shape and known headers (PK for ZIP, 1F 8B for gzip) are needed to distinguish them.
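The two follow-up checks named in that answer can be sketched as below. The header prefixes are the ones quoted above; everything else is an assumption for illustration. In particular, the chi-square statistic is one common way to put a number on "histogram shape", and the page does not say which measure the tool itself uses. The function names `sniff_container` and `chi_square_flatness` are hypothetical.

```python
from collections import Counter
from typing import Optional

# Magic-number prefixes quoted in the FAQ answer above.
MAGIC_PREFIXES = {
    b"PK": "zip",
    b"\x1f\x8b": "gzip",
}


def sniff_container(data: bytes) -> Optional[str]:
    """Return a format name if `data` starts with a known header, else None."""
    for prefix, name in MAGIC_PREFIXES.items():
        if data.startswith(prefix):
            return name
    return None


def chi_square_flatness(data: bytes) -> float:
    """Chi-square statistic of the byte histogram against a uniform one.

    0.0 means a perfectly flat histogram; larger values mean more skew.
    """
    if not data:
        raise ValueError("need at least one byte")
    expected = len(data) / 256  # count each value would have if uniform
    counts = Counter(data)      # missing byte values read back as 0
    return sum((counts[v] - expected) ** 2 / expected for v in range(256))
```

For uniformly random bytes the statistic averages about 255 (its degrees of freedom), so a value far above that on a large input suggests the data is not uniformly distributed. A matching header is only a hint as well, since any file can begin with those bytes.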

Why does my JPEG show high entropy?
JPEG uses entropy coding internally, so the byte distribution is close to flat. This is normal and expected for most compressed media formats.

Is windowed entropy more useful than whole-file entropy?
Often yes, especially for executables and forensic artefacts where a single high-entropy region matters more than the average. Use the sliding window when you need to know where in the file the suspicious content sits.

What window size should I pick?
A common starting point is 256 to 1024 bytes. Smaller windows are noisier; larger windows smooth out short anomalies.
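One reason small windows behave differently is a hard ceiling: a window of n bytes holds at most n distinct values, so its entropy cannot exceed log2(n), capped at 8. The helper below is a hypothetical illustration of that bound (the name `max_window_entropy` is an assumption).

```python
import math


def max_window_entropy(window: int) -> float:
    """Upper bound on entropy (bits per byte) for a window of `window` bytes."""
    if window <= 0:
        raise ValueError("window must be positive")
    # At most min(window, 256) distinct byte values can appear in the window.
    return math.log2(min(window, 256))


if __name__ == "__main__":
    for size in (64, 256, 1024):
        print(size, max_window_entropy(size))
```

A 64-byte window can never read above 6 bits per byte, so whole-file thresholds such as 7.9 do not transfer to small windows. Even at 256 bytes, random data usually scores noticeably below 8 because some byte values repeat while others never appear.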

Does the calculator alter the file?
No. It is strictly read-only: the bytes are tallied into a frequency table, and the file is never modified.
