How to print Unicode glyph names for an input string

command-line unicode

I'd like to be able to run

unicode-names 'abç'

and see the corresponding Unicode character names:

LATIN SMALL LETTER A
LATIN SMALL LETTER B
LATIN SMALL LETTER C WITH CEDILLA

Printing a string as a series of Unicode glyph names would be useful in several cases:

  • Distinguish easily confused characters such as "i" and "í".
  • Explain what a literal string actually contains (for example, non-printable, unassigned, or zero-width characters).

Best Answer

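The uniutils package includes the program uniname, which lists each character of its input along with its code point, encoded bytes, and Unicode name.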

$ echo -n '…—' | uniname
character  byte       UTF-32   encoded as     glyph   name
    0          0  002026   E2 80 A6       …      HORIZONTAL ELLIPSIS
    1          3  002014   E2 80 94       —      EM DASH
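
If uniutils is not available, the output format asked for in the question can be approximated with Python's standard unicodedata module. The following is only a minimal sketch, not part of uniutils: the script name unicode-names comes from the question, while the helper unicode_names and the "<unnamed U+XXXX>" fallback text are assumptions made for illustration.

```python
#!/usr/bin/env python3
"""Print the Unicode character name of every code point in each argument."""
import sys
import unicodedata


def unicode_names(text):
    """Return a list with one name per code point in text (hypothetical helper)."""
    names = []
    for ch in text:
        # unicodedata.name() raises ValueError for code points that have no
        # name (control characters, unassigned code points) unless a default
        # is given, so supply a placeholder that shows the code point instead.
        names.append(unicodedata.name(ch, f"<unnamed U+{ord(ch):04X}>"))
    return names


if __name__ == "__main__":
    # Each command-line argument is treated as one input string.
    for arg in sys.argv[1:]:
        for name in unicode_names(arg):
            print(name)
```

Saved as unicode-names and made executable, running unicode-names 'abç' should print the three names from the question, assuming a UTF-8 locale and a precomposed "ç" (U+00E7). A decomposed "ç" would instead show up as LATIN SMALL LETTER C followed by COMBINING CEDILLA, which is exactly the kind of difference this output is meant to expose.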