How to find out which Unicode code points are defined in a TTF file

Tags: .ttf, fonts, unicode

I need to automate checking which Unicode characters actually have glyphs defined for them in a TrueType font file. How do I go about doing that? I can't find any information on how to make sense of the numbers I get when I open a .ttf file in a text editor.

Best Answer

I found a Python library, fonttools (available on PyPI), that can do this with a bit of scripting.

Here is a simple script that lists all the fonts that have a glyph for a specified code point:

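#!/usr/bin/env python3

import sys

from fontTools.ttLib import TTFont

# Code point to look for: decimal, or hex with a 0x prefix (base=0 auto-detects).
char = int(sys.argv[1], base=0)

print("Looking for U+%X (%c)" % (char, chr(char)))

for arg in sys.argv[2:]:
    try:
        font = TTFont(arg)

        # A font can carry several cmap subtables; only the Unicode ones
        # are keyed by code point.
        for cmap in font['cmap'].tables:
            if cmap.isUnicode():
                if char in cmap.cmap:
                    print("Found in", arg)
                    break  # one match per font is enough
    except Exception as e:
        # Unreadable or non-font file: report it and move on to the next one.
        print("Failed to read", arg)
        print(e)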

The first argument is the code point (decimal, or hexadecimal with a 0x prefix); the remaining arguments are the font files to search.
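The question asks which characters a font covers at all, not only whether one is present, so here is a rough sketch of the same cmap walk turned around to list every mapped code point, collapsed into ranges. The names to_ranges and mapped_codepoints are mine, not part of fonttools, and the output format is just one I picked. Keep in mind that a cmap entry only says the code point maps to some glyph; it does not guarantee the glyph has a visible outline.

```python
#!/usr/bin/env python3
import sys


def to_ranges(codepoints):
    """Collapse code points into sorted, inclusive (first, last) runs."""
    ranges = []
    for cp in sorted(set(codepoints)):
        if ranges and cp == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], cp)  # extend the current run
        else:
            ranges.append((cp, cp))  # start a new run
    return ranges


def mapped_codepoints(path):
    """Return every code point that any Unicode cmap subtable maps to a glyph."""
    # Imported here so to_ranges() stays usable without fontTools installed.
    from fontTools.ttLib import TTFont

    font = TTFont(path)
    try:
        found = set()
        for table in font["cmap"].tables:
            if table.isUnicode():
                found.update(table.cmap)  # keys are integer code points
        return found
    finally:
        font.close()


def main(paths):
    for path in paths:
        ranges = to_ranges(mapped_codepoints(path))
        total = sum(last - first + 1 for first, last in ranges)
        print("%s: %d code points" % (path, total))
        for first, last in ranges:
            if first == last:
                print("  U+%04X" % first)
            else:
                print("  U+%04X..U+%04X" % (first, last))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Pass one or more font files as arguments. Taking the union over all Unicode subtables mirrors the script above; if you only care about the subtable a renderer would most likely use, fonttools also offers font.getBestCmap().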

I didn't bother making it work for .ttc files: a collection holds several fonts, so TTFont has to be told which one to load (its fontNumber argument).
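For collections, something along these lines might work. It is an untested-on-your-fonts sketch that leans on fonttools' TTCollection class to open every member of a .ttc/.otc file; a single member can also be opened with TTFont(path, fontNumber=i). The helper names has_codepoint and iter_fonts are made up for this example, and deciding by file extension is my own shortcut rather than anything fonttools requires.

```python
#!/usr/bin/env python3
import sys


def has_codepoint(font, codepoint):
    """True if any Unicode cmap subtable of `font` maps `codepoint`."""
    return any(
        table.isUnicode() and codepoint in table.cmap
        for table in font["cmap"].tables
    )


def iter_fonts(path):
    """Yield (index, font) for each font stored in `path`."""
    # Imported here so has_codepoint() can be used without fontTools installed.
    from fontTools.ttLib import TTCollection, TTFont

    # Assumption: collections are recognised by their file extension.
    if path.lower().endswith((".ttc", ".otc")):
        collection = TTCollection(path)
        for index, font in enumerate(collection.fonts):
            yield index, font
    else:
        yield 0, TTFont(path)


def main(argv):
    if len(argv) < 2:
        print("usage: CODEPOINT FONT [FONT ...]", file=sys.stderr)
        return
    codepoint = int(argv[0], base=0)  # decimal, or hex with 0x
    for path in argv[1:]:
        try:
            for index, font in iter_fonts(path):
                if has_codepoint(font, codepoint):
                    print("Found U+%X in %s (font #%d)" % (codepoint, path, index))
        except Exception as e:
            print("Failed to read", path)
            print(e)


if __name__ == "__main__":
    main(sys.argv[1:])
```

The arguments are the same as for the first script. The font number is printed because different members of one collection can cover different characters.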

Note: I first tried the otfinfo tool, but it only gave me characters in the Basic Multilingual Plane (<= U+FFFF). The Python script finds supplementary-plane characters (above U+FFFF) without problems.