vimtricks.wiki Concise Vim tricks, one at a time.

How do I see the UTF-8 byte sequence for the character under the cursor?

Answer

g8

Explanation

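g8 shows the UTF-8 byte sequence of the character under the cursor, displaying each byte as a two-digit hexadecimal value in the message area. This is useful when debugging encoding issues, identifying look-alike Unicode characters, or verifying that a file contains exactly the bytes you expect. Note that g8 assumes Vim's 'encoding' option is utf-8 and reports the bytes of Vim's internal text, so the output matches the bytes on disk only when the file is itself saved as UTF-8 (see 'fileencoding').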

How it works

  • g — introduces a family of extended normal mode commands
  • 8 — targets UTF-8 byte representation (as in "UTF-8")
  • Output appears in the message area at the bottom of the screen immediately; no buffer modification occurs

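The display format is a space-separated sequence of hex pairs, one per byte. Single-byte ASCII characters produce one pair; multibyte Unicode characters produce two to four pairs. If the character under the cursor carries combining characters, g8 shows their bytes as well, with a + between the groups.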
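To reproduce this display format outside Vim, the following minimal Python sketch may help. The function name g8_style is hypothetical (it is not part of Vim or of any library); it simply encodes one character as UTF-8 and prints each byte as a two-digit hex pair.

```python
def g8_style(char: str) -> str:
    """Return the UTF-8 bytes of `char` as space-separated hex pairs.

    Hypothetical helper that mimics the format g8 prints; it is not
    part of Vim.
    """
    return " ".join(f"{byte:02x}" for byte in char.encode("utf-8"))


# One sample per encoded length, written as escapes to avoid ambiguity.
samples = {
    "LATIN CAPITAL LETTER A": "\u0041",    # 1 byte
    "LATIN SMALL LETTER E WITH ACUTE": "\u00e9",  # 2 bytes
    "EURO SIGN": "\u20ac",                 # 3 bytes
    "GRINNING FACE": "\U0001f600",         # 4 bytes
}

for label, char in samples.items():
    print(f"{label}: {g8_style(char)}")
```

The four samples cover each possible UTF-8 length, from one byte for ASCII up to four bytes for code points beyond U+FFFF.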

Example

With the cursor on the Euro sign (U+20AC):

g8

Vim displays at the bottom of the screen:

e2 82 ac

These three bytes are the UTF-8 encoding of U+20AC. By contrast, with the cursor on the plain ASCII letter A, g8 displays:

41

A single byte, 0x41 (decimal 65).
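To see where e2 82 ac comes from, here is a hand-rolled sketch of the UTF-8 bit layout in Python. The function encode_utf8 is a hypothetical name used for illustration only; in real code, Python's built-in str.encode("utf-8") does the same job.

```python
def encode_utf8(code_point: int) -> bytes:
    """Encode one Unicode code point as UTF-8 by hand (illustrative sketch)."""
    if code_point < 0 or code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not an encodable Unicode scalar value")
    if code_point < 0x80:
        # 1 byte: 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


# U+20AC is 0010 0000 1010 1100 in binary; split 4 + 6 + 6 bits:
#   0010 -> 1110 0010 (e2), 000010 -> 10 000010 (82), 101100 -> 10 101100 (ac)
euro = encode_utf8(0x20AC)
print(" ".join(f"{b:02x}" for b in euro))

# Cross-check against Python's built-in encoder.
assert euro == "\u20ac".encode("utf-8")
```

The letter A (U+0041) falls below 0x80, so it takes the first branch and is stored as the single byte 41.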

Tips

  • Compare with ga, which shows the code point in decimal, hex, and octal; g8 shows the UTF-8 bytes that encode the character, not the code point
  • Detect invisible troublemakers: a non-breaking space (U+00A0) shows c2 a0; a zero-width space (U+200B) shows e2 80 8b; a right-to-left mark (U+200F) shows e2 80 8f
  • If two characters look identical on screen but behave differently, g8 reveals whether they are truly the same code point or Unicode homoglyphs
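The same checks can be scripted when a whole string needs auditing rather than one character at a time. The sketch below is one possible approach in Python; describe and find_non_ascii are hypothetical helper names, and treating every non-ASCII character as suspicious is an assumption made for the example, not a rule from Vim.

```python
import unicodedata


def describe(char: str) -> str:
    """Return the code point (as ga would help identify), the UTF-8 bytes
    (as g8 would show), and the Unicode name of one character."""
    code_point = f"U+{ord(char):04X}"
    utf8_bytes = " ".join(f"{b:02x}" for b in char.encode("utf-8"))
    name = unicodedata.name(char, "<unnamed>")
    return f"{code_point}  {utf8_bytes}  {name}"


def find_non_ascii(text: str) -> list[tuple[int, str]]:
    """List (index, description) for every character outside ASCII."""
    return [(i, describe(ch)) for i, ch in enumerate(text) if ord(ch) > 0x7F]


# A string hiding a non-breaking space, a zero-width space,
# and a right-to-left mark between ordinary letters.
suspect = "a\u00a0b\u200bc\u200fd"
for index, description in find_non_ascii(suspect):
    print(index, description)

# Homoglyphs: Latin A and Cyrillic A render alike but differ in bytes.
print(describe("\u0041"))
print(describe("\u0410"))
```

Running it on a suspect line pasted from a web page or a word processor gives a quick list of positions to inspect with g8 back in Vim.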
