gzip / bzip2 Header

Decode the header of a .gz or .bz2 file.


Overview

The gzip and bzip2 header decoder reads the leading bytes of a .gz or .bz2 file and reports the encoded metadata. For gzip that means the compression method, flags, modification time, original filename (if stored), comment, and the trailer's CRC32 and uncompressed size; for bzip2 it means the block size and the first block's checksum. It is a quick way to peek at an archive without decompressing it.

Sysadmins inspecting log rotations, security responders looking at quarantined archives, and developers debugging build artefacts reach for this when they need to know what is inside a compressed file before extracting it. Long-tail searches that lead here include "read gzip header without extracting", "decode .gz metadata", and "what's inside this bz2 file".

How it works

A gzip stream starts with a 10-byte fixed header defined in RFC 1952: the magic bytes 1F 8B, a compression method byte (8 for DEFLATE), a flags byte (FTEXT, FHCRC, FEXTRA, FNAME, FCOMMENT), a 4-byte little-endian modification time as a Unix timestamp, an extra-flags byte, and a source OS byte. Optional fields follow in a fixed order according to the flags: extra subfields, the null-terminated original filename, a null-terminated comment, and a CRC16 of the header. The 8-byte trailer at the end of the stream contains a CRC32 of the uncompressed data and the original size modulo 2³².
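The layout above can be sketched in Python with only the standard library. This is a minimal illustration, not the tool's actual implementation: the function name parse_gzip_header and the keys of the returned dictionary are hypothetical, and the sketch only looks at the first member of the stream.

```python
import struct
from datetime import datetime, timezone

# Flag bits as defined in RFC 1952.
FTEXT, FHCRC, FEXTRA, FNAME, FCOMMENT = 0x01, 0x02, 0x04, 0x08, 0x10


def parse_gzip_header(data: bytes) -> dict:
    """Parse the header of the first gzip member in `data` (hypothetical helper)."""
    if len(data) < 10 or data[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream: bad magic bytes")

    # Fixed part: CM, FLG, MTIME (little-endian uint32), XFL, OS.
    method, flags, mtime, xfl, os_id = struct.unpack_from("<BBIBB", data, 2)
    info = {
        "method": method,
        "flags": flags,
        "mtime": mtime,
        # MTIME of 0 means "no timestamp available".
        "mtime_utc": datetime.fromtimestamp(mtime, tz=timezone.utc) if mtime else None,
        "xfl": xfl,
        "os": os_id,
        "name": None,
        "comment": None,
        "header_crc16": None,
    }
    pos = 10

    if flags & FEXTRA:
        # Two-byte length, then that many bytes of subfields (skipped here).
        (xlen,) = struct.unpack_from("<H", data, pos)
        pos += 2 + xlen
    if flags & FNAME:
        end = data.index(b"\x00", pos)  # raises ValueError if unterminated
        info["name"] = data[pos:end].decode("latin-1")
        pos = end + 1
    if flags & FCOMMENT:
        end = data.index(b"\x00", pos)
        info["comment"] = data[pos:end].decode("latin-1")
        pos = end + 1
    if flags & FHCRC:
        (info["header_crc16"],) = struct.unpack_from("<H", data, pos)
        pos += 2

    info["header_length"] = pos  # offset where the compressed payload begins
    return info
```

Passing the first few kilobytes of a file is normally enough, since the header is short unless the extra field or comment is unusually large. Names and comments are decoded as Latin-1 because that is the character set RFC 1952 specifies for them.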

A bzip2 file (.bz2) follows a different layout per Julian Seward's reference implementation: the magic BZh, a block-size digit (the ASCII characters 1 through 9, representing block sizes of 100 kB to 900 kB), then the 6-byte block-start marker 31 41 59 26 53 59 (the decimal digits of pi) followed by the block's 32-bit checksum. The decoder reports the magic, block size, and the first block's checksum without running the BWT decompressor.
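A comparable sketch for the bzip2 side is shown below. It is an illustration under stated assumptions rather than the tool's own code: the name parse_bzip2_header and the dictionary keys are hypothetical, the block size is taken as the level times 100,000 bytes, and only the first block header is examined (it is the only one guaranteed to be byte-aligned).

```python
# 48-bit markers used by the bzip2 format.
BLOCK_MAGIC = bytes.fromhex("314159265359")  # start of a compressed block
EOS_MAGIC = bytes.fromhex("177245385090")    # end of stream (empty input)


def parse_bzip2_header(data: bytes) -> dict:
    """Parse the stream header and first block header (hypothetical helper)."""
    if len(data) < 4 or data[:3] != b"BZh":
        raise ValueError("not a bzip2 stream: bad magic bytes")

    level = data[3] - ord("0")  # ASCII digit '1'..'9'
    if not 1 <= level <= 9:
        raise ValueError("invalid bzip2 block-size digit")

    info = {
        "level": level,
        "block_size_bytes": level * 100_000,  # assumption: 100,000 bytes per level
        "first_block_crc": None,
        "empty_stream": False,
    }

    marker = data[4:10]
    if marker == BLOCK_MAGIC and len(data) >= 14:
        # The block CRC is stored most-significant byte first.
        info["first_block_crc"] = int.from_bytes(data[10:14], "big")
    elif marker == EOS_MAGIC:
        # An empty input produces no blocks, only the end-of-stream marker.
        info["empty_stream"] = True
    return info
```

Fourteen bytes are enough for this check, so the function can be fed a very small read from the start of the file.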

Examples

  • Confirm a .gz log file was rotated yesterday by reading its modification time.
  • Recover the original filename when a .gz was renamed during transit.
  • Check whether a bzip2 file uses a non-default block size to estimate decompression memory cost.
  • Spot a corrupted header (wrong magic bytes) before wasting time on a failing extraction.

FAQ

Does it decompress the file?
No. Only the header (and, for gzip, the 8-byte trailer) is read. The payload bytes are untouched, so even multi-gigabyte archives parse almost instantly.

Is the gzip filename always present?
No. The FNAME flag is optional; if a .gz was created with gzip -n, the original filename is omitted.

What is "FTEXT"?
A hint that the original data was text rather than binary. Decompressors generally ignore it and detect on the fly.

Why is the reported size sometimes wrong?
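Gzip's ISIZE field stores the uncompressed length modulo 2³² (4 GiB). For larger originals the value wraps around; the compressed data is still intact, but the reported size is no longer the true size. The field also describes only the last member of the stream, so a file made of several concatenated gzip members reports the size of the final member alone.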
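The wrap-around can be illustrated with a short Python sketch. The helper names read_gzip_trailer and expected_isize are hypothetical and chosen only for this example.

```python
import struct


def read_gzip_trailer(data: bytes) -> tuple:
    """Return (crc32, isize) from the last 8 bytes of a gzip stream."""
    if len(data) < 8:
        raise ValueError("too short to contain a gzip trailer")
    # Both fields are little-endian unsigned 32-bit integers.
    crc32, isize = struct.unpack("<II", data[-8:])
    return crc32, isize


def expected_isize(original_length: int) -> int:
    """ISIZE that gzip would store for an input of `original_length` bytes."""
    return original_length % 2**32


# A 5 GiB original wraps once and is reported as 1 GiB.
five_gib = 5 * 2**30
assert expected_isize(five_gib) == 2**30
```

Because of this, a reported size is best read as a lower bound whenever the compressed file is large enough that the original could plausibly exceed 4 GiB.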

Does it support .tar.gz specifically?
A .tar.gz is just a gzip stream wrapping a TAR archive. The decoder reads the outer gzip header; to see the entries inside, run the TAR inspector on the decompressed bytes.
