Terminal – How to Find the Common Name for a Particular Glyph

special characters, terminal

Sometimes, I'd like to know the name of a glyph. For example, if I see a dash-like character, I may want to know whether it's a hyphen -, an en-dash –, an em-dash —, or a minus symbol −. Is there a way to copy-paste it into a terminal and see what it is?

I am not sure whether my system knows the common names for these glyphs, but some (partial) information is certainly available, for instance in /usr/share/X11/locale/en_US.UTF-8/Compose. For example:

<Multi_key> <exclam> <question>         : "‽"   U203D # INTERROBANG

Another example glyph: ?.

Best Answer

Try the unicode utility:

$ unicode ‽
U+203D INTERROBANG
UTF-8: e2 80 bd  UTF-16BE: 203d  Decimal: &#8253;
‽
Category: Po (Punctuation, Other)
Bidi: ON (Other Neutrals)

Or the uconv utility from the ICU package:

$ printf %s ‽ | uconv -x any-name
\N{INTERROBANG}

You can also get information via the recode utility:

$ printf %s ‽ | recode ..dump
UCS2   Mne   Description

203D         point exclarrogatif

Or with Perl:

$ printf %s ‽ | perl -CLS -Mcharnames=:full -lne 'print charnames::viacode(ord) for /./g'
INTERROBANG
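
The same per-character lookup can be sketched in Python with the standard-library unicodedata module. This is not part of the original answer: the helper name describe is made up for illustration, and the names come from the Unicode database bundled with your Python build, so very recent characters may show up as unnamed.

```python
import unicodedata


def describe(text):
    """Return one 'U+XXXX NAME' string per code point in text."""
    lines = []
    for ch in text:
        # name() raises ValueError for unnamed code points (e.g. control
        # characters) unless a default is supplied, so pass a placeholder.
        name = unicodedata.name(ch, "<no name>")
        lines.append(f"U+{ord(ch):04X} {name}")
    return lines


# The interrobang, then the four look-alike dashes from the question:
# hyphen, en-dash, em-dash and minus sign, written as escapes so that
# they cannot be confused in the source.
for line in describe("\u203d-\u2013\u2014\u2212"):
    print(line)
```

Like the tools above, this reports on individual code points, so a letter followed by a combining accent is listed as two entries.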

Note that these report the characters that make up a glyph, not the glyph as a whole. For instance, for é entered as e followed by a combining acute accent (U+0301):

$ printf 'e\u0301' | uconv -x any-name
\N{LATIN SMALL LETTER E}\N{COMBINING ACUTE ACCENT}

That differs from the precomposed é character (U+00E9):

$ printf '\u00e9' | uconv -x any-name
\N{LATIN SMALL LETTER E WITH ACUTE}
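
Because the two forms look identical on screen, it can help to compare their code points and UTF-8 bytes directly. The following is a minimal Python sketch of that check, not something from the original answer; the helper name code_points is hypothetical.

```python
# Two strings that render the same but are stored differently.
decomposed = "e\u0301"   # LATIN SMALL LETTER E + COMBINING ACUTE ACCENT
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE


def code_points(text):
    """List the code points in text as 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in text]


print(decomposed == precomposed)          # False: different code point sequences
print(code_points(decomposed))            # ['U+0065', 'U+0301']
print(code_points(precomposed))           # ['U+00E9']
print(decomposed.encode("utf-8").hex())   # 65cc81
print(precomposed.encode("utf-8").hex())  # c3a9
```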

You can ask uconv to recombine them by normalizing to NFC first (this only affects sequences that have a precomposed form):

$ printf 'e\u0301b\u0301' | uconv -x '::nfc;::name;'
\N{LATIN SMALL LETTER E WITH ACUTE}\N{LATIN SMALL LETTER B}\N{COMBINING ACUTE ACCENT}

(é has a combined form, but not b́).
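
The recomposition step can also be sketched in Python, assuming unicodedata.normalize with the "NFC" form as a stand-in for uconv's ::nfc transform. The helper name names is hypothetical and this is only an illustration of the idea, not the original author's method.

```python
import unicodedata


def names(text):
    """Return the Unicode name of each code point in text."""
    return [unicodedata.name(ch, "<no name>") for ch in text]


# e and b, each followed by COMBINING ACUTE ACCENT (U+0301).
decomposed = "e\u0301b\u0301"

# NFC composes sequences that have a precomposed equivalent and leaves
# the rest alone, so only the e + accent pair collapses into one code point.
composed = unicodedata.normalize("NFC", decomposed)

print(names(decomposed))
print(names(composed))
```

The second list has three entries instead of four, since b with an acute accent has no single code point.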
