Unicode Character Inspector

Inspect every character: name, code point, UTF-8 bytes, block.

Overview

Inspect every character of your input. For each character, the inspector shows the visible glyph, the Unicode code point (like U+0041), the official character name (LATIN CAPITAL LETTER A), the Unicode block it lives in, and the UTF-8 byte sequence used to encode it. Combining marks, zero-width characters, and emoji clusters are all surfaced.

Developers debugging encoding issues, font designers checking glyph coverage, security analysts looking for homograph attacks, linguists studying lesser-known scripts, and anyone who has ever been confused by an invisible character all reach for it. It's the Swiss Army knife for answering "what is this character, actually?"

How it works

The tool iterates over each character (more precisely, each grapheme cluster, since a single visible character can span several code points) and looks up its properties in the Unicode database: the official name, general category (letter, mark, symbol, punctuation), block (e.g., Latin Extended-A, Cyrillic, Emoticons), and bidirectional class. UTF-8 encoding is computed on the fly: ASCII fits in one byte, the rest of the BMP takes two or three, and supplementary-plane characters (most emoji) take four.
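The lookup described above can be approximated with Python's standard unicodedata module. This is a minimal sketch, not the tool's actual implementation: the inspect function and its record keys are names made up for illustration, it walks code points rather than grapheme clusters (the standard library has no grapheme segmenter), and it omits the block because unicodedata does not expose it.

```python
import unicodedata


def inspect(text):
    """Return one record per code point in text (hypothetical helper)."""
    records = []
    for ch in text:
        records.append({
            "char": ch,
            "code": f"U+{ord(ch):04X}",
            # Some code points (e.g. controls) have no name; use a placeholder.
            "name": unicodedata.name(ch, "<unnamed>"),
            "category": unicodedata.category(ch),   # e.g. "Lu", "Mn", "So"
            "bidi": unicodedata.bidirectional(ch),  # e.g. "L", "ON"
            "utf8": " ".join(f"0x{b:02X}" for b in ch.encode("utf-8")),
        })
    return records


if __name__ == "__main__":
    # A, e with acute, and the party popper emoji, written as escapes.
    for record in inspect("A\u00e9\U0001F389"):
        print(record)
```

The names and categories returned depend on the Unicode version bundled with the Python build, which is available as unicodedata.unidata_version.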

Examples

Char:  A
Code:  U+0041
Name:  LATIN CAPITAL LETTER A
Block: Basic Latin
UTF-8: 0x41

Char:  é
Code:  U+00E9
Name:  LATIN SMALL LETTER E WITH ACUTE
Block: Latin-1 Supplement
UTF-8: 0xC3 0xA9

Char:  🎉
Code:  U+1F389
Name:  PARTY POPPER
Block: Miscellaneous Symbols and Pictographs
UTF-8: 0xF0 0x9F 0x8E 0x89
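The byte sequences in these examples can be reproduced by hand from the code point alone. The sketch below spells out the standard UTF-8 bit layout. The function name utf8_bytes is hypothetical, and in practice Python's built-in str.encode("utf-8") does the same job.

```python
def utf8_bytes(cp):
    """Encode one code point as UTF-8 by hand (illustrative helper)."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not an encodable scalar value: {cp:#x}")
    if cp < 0x80:        # 1 byte:  0xxxxxxx
        return bytes([cp])
    if cp < 0x800:       # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:     # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


if __name__ == "__main__":
    for cp in (0x41, 0xE9, 0x1F389):
        print(f"U+{cp:04X}", " ".join(f"0x{b:02X}" for b in utf8_bytes(cp)))
```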

FAQ

Why are some "characters" multi-codepoint?

Emoji with skin-tone modifiers, family clusters, flags, and accented letters in NFD form are all single visible glyphs built from multiple Unicode code points. The inspector lists each component.
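As a rough illustration of what "each component" means, the sketch below breaks a few single-glyph strings into their code points using Python's unicodedata module. The components helper is a made-up name, and the samples are written as escapes so the code stays readable even where a font cannot render them.

```python
import unicodedata


def components(cluster):
    """List (code point, name) pairs for every code point in a string."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in cluster
    ]


samples = {
    # Precomposed e-acute, decomposed into base letter + combining mark.
    "e-acute in NFD": unicodedata.normalize("NFD", "\u00e9"),
    # Waving hand followed by a skin-tone modifier.
    "emoji with skin tone": "\U0001F44B\U0001F3FD",
    # A flag is a pair of regional indicator symbols (here J and P).
    "flag": "\U0001F1EF\U0001F1F5",
}

if __name__ == "__main__":
    for label, text in samples.items():
        print(label)
        for code, name in components(text):
            print("   ", code, name)
```

Each sample renders as one glyph where fonts support it, yet every one of them contains two code points.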

What's a Unicode block?

Unicode is organized into named ranges called blocks: Basic Latin (0–127), Greek and Coptic (880–1023), CJK Unified Ideographs, Emoticons, and many more. The block usually tells you which script or symbol family the character belongs to.
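A block lookup is a range search over a sorted table. The sketch below hard-codes a handful of real block ranges for illustration; a full implementation would presumably load every range from the Unicode Character Database file Blocks.txt. The names BLOCKS, STARTS, and block_of are hypothetical.

```python
from bisect import bisect_right

# Partial table of (first code point, last code point, block name),
# sorted by first code point. Only a few blocks are listed here.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0100, 0x017F, "Latin Extended-A"),
    (0x0370, 0x03FF, "Greek and Coptic"),
    (0x0400, 0x04FF, "Cyrillic"),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs"),
    (0x1F300, 0x1F5FF, "Miscellaneous Symbols and Pictographs"),
    (0x1F600, 0x1F64F, "Emoticons"),
]
STARTS = [first for first, _, _ in BLOCKS]


def block_of(ch):
    """Return the block name for ch, or None if it is not in the table."""
    cp = ord(ch)
    i = bisect_right(STARTS, cp) - 1  # last block starting at or before cp
    if i >= 0 and cp <= BLOCKS[i][1]:
        return BLOCKS[i][2]
    return None


if __name__ == "__main__":
    for ch in ("A", "\u00e9", "\u03b1", "\U0001F389"):
        print(f"U+{ord(ch):04X}", block_of(ch))
```

Binary search keeps the lookup fast even with the complete table, which has a few hundred entries.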

How many code points does Unicode define?

The 17 planes contain space for 1,114,112 code points; about 150,000 are assigned as of Unicode 15. Most assigned code points are CJK characters and historical scripts.
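The total follows from simple arithmetic: each plane holds 65,536 (0x10000) code points, so 17 planes give 17 × 65,536 = 1,114,112, and the highest code point is U+10FFFF. A short check, with plane_of as a made-up helper name:

```python
import sys

PLANES = 17
PLANE_SIZE = 0x10000  # 65,536 code points per plane

total = PLANES * PLANE_SIZE


def plane_of(cp):
    """Return the plane number (0 to 16) that a code point falls in."""
    return cp >> 16


if __name__ == "__main__":
    print(f"{total:,}")                # size of the code space
    print(f"U+{sys.maxunicode:X}")     # highest code point Python accepts
    print(plane_of(0x41), plane_of(0x1F389))
```

Plane 0 is the BMP; the party popper emoji from the examples sits in plane 1.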
