Terminal – Determine Displayable Characters

fonts terminal unicode

I'm working on a script that displays UTF-8 characters as output. In my GNOME Terminal, this prints out a pretty maple leaf (🍁):

$ echo -e '\xF0\x9F\x8D\x81'

In rxvt, it prints out a box (the character it uses for "unknown"). The locale is UTF-8 in both, but the fonts are different. Is there a way to determine, on a user's machine, whether certain characters are supported or not?

Best Answer

An application running in a terminal has no way to find out from the terminal what the glyphs that the terminal has drawn look like (or even if they are substitute/placeholder characters).

One thing the application can do is find out whether the terminal supports UTF-8 at all and, if it does, whether it supports variable-width characters. The method is as follows:

  • Read the cursor position by writing ESC [ 6 n and expecting ESC [ line ; col R
  • Write the 2-byte sequence "\xc2\xa0". If the terminal supports UTF-8, this is a single nonbreaking space. If the terminal does not support UTF-8, it's something unknown but which probably occupies 2 character positions (probably Â followed by a nonbreaking space, in fact).
  • Read the cursor position again and find out if the cursor moved by one position or two positions
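The three steps above could be sketched in bash roughly as follows. This is only an illustration under some assumptions: the function names (parse_col, cursor_col, supports_utf8) are made up for this example, the script has a controlling terminal reachable as /dev/tty, the terminal answers ESC [ 6 n within one second, and the cursor is not sitting in the last column when the probe is written.

```bash
#!/usr/bin/env bash
# Sketch only: function names are illustrative, not part of the answer.

# Pull the column out of a cursor position report "ESC [ line ; col R".
parse_col() {
  local reply=${1%R}            # the final "R" may already be stripped
  local col=${reply##*;}        # keep what follows the last ";"
  case $col in
    ''|*[!0-9]*) return 1 ;;    # not a plain number: reject
  esac
  printf '%s\n' "$col"
}

# Ask the terminal where the cursor is and print the column.
cursor_col() {
  local saved reply status
  saved=$(stty -g 2>/dev/null < /dev/tty) || return 1  # no terminal
  stty -echo -icanon < /dev/tty               # keep the reply off the screen
  printf '\033[6n' > /dev/tty
  IFS= read -r -t 1 -d R reply < /dev/tty     # the reply ends with "R"
  status=$?
  stty "$saved" < /dev/tty                    # restore terminal settings
  [ "$status" -eq 0 ] && parse_col "$reply"
}

# Exit status 0: the cursor advanced one column (UTF-8 was decoded);
# 1: it advanced by some other amount; 2: the terminal did not answer.
supports_utf8() {
  local before after
  before=$(cursor_col) || return 2
  printf '\xc2\xa0' > /dev/tty                # U+00A0 encoded as UTF-8
  after=$(cursor_col) || return 2
  printf '\033[%dG\033[K' "$before" > /dev/tty  # go back, erase the probe
  [ $((after - before)) -eq 1 ]
}
```

With these functions defined, something like `supports_utf8 && echo "UTF-8 looks supported"` would run the probe. Note that this only tells you the bytes were decoded as one character, not that the font has a glyph for it.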

If the terminal does support UTF-8, then you can find out if it supports variable character widths by using basically the same trick. Read the cursor position, write a character which is supposed to be double-width in monospace fonts, such as あ, then read the cursor position again. If the terminal does not support double-width characters, the cursor will probably have naively moved by only one position.
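A rough bash sketch of the double-width check, under the same kind of assumptions as any cursor-position probe: a controlling terminal at /dev/tty that answers ESC [ 6 n within a second, and a cursor that is not near the right margin. The helper names (read_col, probe_advance, classify_advance) are hypothetical, chosen only for this example.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are illustrative.

# Print the cursor column from the reply to ESC [ 6 n (ESC [ line ; col R).
read_col() {
  local reply
  printf '\033[6n' > /dev/tty
  IFS= read -r -t 1 -d R reply < /dev/tty || return 1
  printf '%s\n' "${reply##*;}"    # keep what follows the last ";"
}

# Print how many columns the cursor advances when "$1" is written.
probe_advance() {
  local saved before after status
  saved=$(stty -g 2>/dev/null < /dev/tty) || return 1  # no terminal
  stty -echo -icanon < /dev/tty     # keep the replies off the screen
  before=$(read_col) && printf '%s' "$1" > /dev/tty && after=$(read_col)
  status=$?
  # On success, move back to the starting column and erase the probe.
  [ "$status" -eq 0 ] && printf '\033[%dG\033[K' "$before" > /dev/tty
  stty "$saved" < /dev/tty          # restore terminal settings
  [ "$status" -eq 0 ] && echo $((after - before))
}

# Turn the measured advance into a verdict.
classify_advance() {
  case $1 in
    2) echo double-width ;;
    1) echo single-width ;;
    *) echo unknown ;;
  esac
}
```

For example, `classify_advance "$(probe_advance $'\xe3\x81\x82')"` writes あ (U+3042, given here as its UTF-8 bytes) and reports whether the cursor moved by two columns or one. As with the UTF-8 check, this measures cursor movement only and says nothing about whether the font actually contains the glyph.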