Xterm not displaying unicode

i have never been able to get my terminal to display unicode symbols. for example, before i had my present os, i mapped ctrl+a to the greek mu in vim, and it works on other computers, but not on my current xterm. here is the relevant section of my .vimrc:

set encoding=utf-8
"map control-a to mu
imap <C-a> <C-k>m*

also, i need to output sympy equations in python, and this works on other computers, but not on my current xterm. instead of this:

$ python
Python 2.7.3 (default, Mar 14 2014, 11:57:14) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sympy
>>> x = sympy.symbols('x')
>>> sympy.init_printing()
>>> (sympy.sqrt(x**3/(x+1)), 1)
⎛     _______   ⎞
⎜    ╱    3     ⎟
⎜   ╱    x      ⎟
⎜  ╱   ───── , 1⎟
⎝╲╱    x + 1    ⎠

i get this:

>>> (sympy.sqrt(x**3/(x+1)), 1)
n      -------   n
n     n    3     n
n    n    x      n
n   n   ───── , 1n
nnnn    x + 1    n

in fact it seems to just use the n character whenever it can't display a unicode character.
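
to see exactly which characters are involved, the code points in the expected output can be listed. this is a minimal python 3 sketch (not the python 2.7 session above); the helper name non_ascii_report is made up for illustration, and the sample string is just the last line of the expected sympy output:

```python
import unicodedata


def non_ascii_report(text):
    """Return (code point, name) for each distinct non-ASCII character in text."""
    seen = []
    for ch in text:
        if ord(ch) > 0x7F and ch not in seen:
            seen.append(ch)
    return [("U+%04X" % ord(ch), unicodedata.name(ch, "<unnamed>")) for ch in seen]


# Last line of the expected sympy output shown above.
line = "⎝╲╱    x + 1    ⎠"

# Print only code points and names, so the report stays readable even in a
# terminal that cannot draw the characters themselves.
for code_point, name in non_ascii_report(line):
    print(code_point, name)
```

the reported code points are the ones a candidate font would need to cover.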

i'm running xterm from an ~/.xinitrc file and setting some fonts and colors for the terminal in ~/.Xresources. here is all the relevant information i could think of:

$ uname -a
Linux mypcname 3.2.0-4-amd64 #1 SMP Debian 3.2.60-1+deb7u3 x86_64 GNU/Linux
$ xterm -version
XTerm(278)
$ cat ~/.xinitrc 
#!/bin/bash

#update the xterm colors, font size, etc
[[ -f ~/.Xresources ]] && xrdb -merge ~/.Xresources

# run the window manager in the background first
metacity &

# get the window manager process id
wm_pid=$!

# wait a little while for the window manager to load (extend this if the xterm is not being properly maximised)
sleep 2

# run the xterm in fullscreen
#xterm +u8 -js -fullscreen &
xterm -en en_AU.UTF-8 -js -fullscreen &

# do not let the window manager become a zombie
wait $wm_pid

# this would run xterm first, then the window manager. doesn't maximise properly the first time startx is run
#xterm -fullscreen &
#exec mutter

$ cat ~/.Xresources 
! see man xterm under the resources heading for explanations
! run `xrdb -merge ~/.Xresources` after altering this file
! run `xrdb -query -all` to see the current settings

xterm.vt100.faceName: Terminus
xterm.vt100.faceSize: 14
! do not display bold fonts in bold
xterm.vt100.AllowBoldFonts: false
! display bold fonts in a different color to make them stand out
xterm.vt100.colorBDMode: true
! use green as the bold color (same as in ~/.bashrc)
xterm.vt100.colorBD: #98E34D

! cols x lines ... update with values from $(echo $COLUMNS) and $(echo $LINES)
xterm.vt100.geometry: 126x52

! dark green foreground (same as in ~/.bashrc)
*foreground: #4E9A06
! black background
*background: #000000

! scroll quickly
xterm*fastScroll: true

! enable utf-8 encoding
xterm*locale: true
xterm*utf8: 1

! flash the current line instead of making the bell sound
*visualBell: true
*visualBellLine: true

! black
*color0: #2E3436
! darkred
*color1: #CC0000
! dark green
*color2: #4E9A06
! brown
*color3: #C4A000
! darkblue
*color4: #3465A4
! darkmagenta
*color5: #75507B
! darkcyan
*color6: #06989A
! lightgrey
*color7: #D3D7CF
! darkgrey
*color8: #555753
! red
*color9: #EF2929
! green
*colorA: #8AE234
! yellow
*colorB: #FCE94F
! blue
*colorC: #729FCF
! magenta
*colorD: #AD7FA8
! cyan
*colorE: #34E2E2
! white
*colorF: #EEEEEC

$ tail -10 .bashrc
PATH=/usr/local/bin:/usr/bin:/bin:/sbin

export LC_ALL=en_AU.UTF-8
export LANG=en_AU.UTF-8
export LANGUAGE=en_AU.UTF-8

# final logon actions:

# go straight to x on login. only do this for tty1 so that we can still use the other tty consoles without starting x. also only do this when there is not already a display, otherwise the xterm will try and do this after x starts aswell
[[ -z $DISPLAY ]] && [[ $(tty) = /dev/tty1 ]] && startx

$ locale
LANG=en_AU.UTF-8
LANGUAGE=en_AU.UTF-8
LC_CTYPE="en_AU.UTF-8"
LC_NUMERIC="en_AU.UTF-8"
LC_TIME="en_AU.UTF-8"
LC_COLLATE="en_AU.UTF-8"
LC_MONETARY="en_AU.UTF-8"
LC_MESSAGES="en_AU.UTF-8"
LC_PAPER="en_AU.UTF-8"
LC_NAME="en_AU.UTF-8"
LC_ADDRESS="en_AU.UTF-8"
LC_TELEPHONE="en_AU.UTF-8"
LC_MEASUREMENT="en_AU.UTF-8"
LC_IDENTIFICATION="en_AU.UTF-8"
LC_ALL=en_AU.UTF-8

$ printenv XTERM_LOCALE
en_AU.UTF-8

$ xrdb -query -all
*background:    #000000
*color0:    #2E3436
*color1:    #CC0000
*color2:    #4E9A06
*color3:    #C4A000
*color4:    #3465A4
*color5:    #75507B
*color6:    #06989A
*color7:    #D3D7CF
*color8:    #555753
*color9:    #EF2929
*colorA:    #8AE234
*colorB:    #FCE94F
*colorC:    #729FCF
*colorD:    #AD7FA8
*colorE:    #34E2E2
*colorF:    #EEEEEC
*foreground:    #4E9A06
*visualBell:    true
*visualBellLine:    true
xterm*fastScroll:   true
xterm*locale:   true
xterm*utf8: 1
xterm.vt100.AllowBoldFonts: false
xterm.vt100.colorBD:    #98E34D
xterm.vt100.colorBDMode:    true
xterm.vt100.faceName:   Terminus
xterm.vt100.faceSize:   14
xterm.vt100.geometry:   126x52

how can i get utf-8 working to display greek symbols in vim and equations in sympy?


extra information requested

$ echo $TERM
xterm
$ appres XTerm
*form.Thickness:    0
*tekMenu*tekreset*Label:    RESET
*tekMenu*tektext2*Label:    #2 Size Characters
*tekMenu*tekhide*Label: Hide Tek Window
*tekMenu*tekcopy*Label: COPY
*tekMenu*tektext3*Label:    #3 Size Characters
*tekMenu*vtshow*Label:  Show VT Window
*tekMenu*tektextsmall*Label:    Small Characters
*tekMenu*vtmode*Label:  Switch to VT Mode
*tekMenu*tektextlarge*Label:    Large Characters
*tekMenu*tekpage*Label: PAGE
*tekMenu.Label: Tek Options
*mainMenu*redraw*Label: Redraw Window
*mainMenu*sunKeyboard*Label:    VT220 Keyboard
*mainMenu*terminate*Label:  Send TERM Signal
*mainMenu*backarrow key*Label:  Backarrow Key (BS/DEL)
*mainMenu*logging*Label:    Log to File
*mainMenu*hpFunctionKeys*Label: HP Function-Keys
*mainMenu*kill*Label:   Send KILL Signal
*mainMenu*num-lock*Label:   Alt/NumLock Modifiers
*mainMenu*print-immediate*Label:    Print-All Immediately
*mainMenu*scoFunctionKeys*Label:    SCO Function-Keys
*mainMenu*quit*Label:   Quit
*mainMenu*alt-esc*Label:    Alt Sends Escape
*mainMenu*print-on-error*Label: Print-All on Error
*mainMenu*tcapFunctionKeys*Label:   Termcap Function-Keys
*mainMenu*meta-esc*Label:   Meta Sends Escape
*mainMenu*toolbar*Label:    Toolbar
*mainMenu*print*Label:  Print Window
*mainMenu*suspend*Label:    Send STOP Signal
*mainMenu*delete-is-del*Label:  Delete is DEL
*mainMenu*print-redir*Label:    Redirect to Printer
*mainMenu*fullscreen*Label: Full Screen
*mainMenu*continue*Label:   Send CONT Signal
*mainMenu*oldFunctionKeys*Label:    Old Function-Keys
*mainMenu*securekbd*Label:  Secure Keyboard
*mainMenu*interrupt*Label:  Send INT Signal
*mainMenu*8-bit control*Label:  8-Bit Controls
*mainMenu*allowsends*Label: Allow SendEvents
*mainMenu*sunFunctionKeys*Label:    Sun Function-Keys
*mainMenu*hangup*Label: Send HUP Signal
*mainMenu.Label:    Main Options
*VT100.utf8Fonts.font4: -misc-fixed-medium-r-normal--13-120-75-75-c-80-iso10646-1
*VT100.utf8Fonts.font2: -misc-fixed-medium-r-normal--8-80-75-75-c-50-iso10646-1
*VT100.utf8Fonts.font6: -misc-fixed-medium-r-normal--20-200-75-75-c-100-iso10646-1
*VT100.utf8Fonts.font5: -misc-fixed-medium-r-normal--18-120-100-100-c-90-iso10646-1
*VT100.utf8Fonts.font3: -misc-fixed-medium-r-normal--14-130-75-75-c-70-iso10646-1
*VT100.utf8Fonts.font:  -misc-fixed-medium-r-semicondensed--13-120-75-75-c-60-iso10646-1
*VT100.font4:   7x13
*VT100.font2:   5x7
*VT100.font6:   10x20
*VT100.font5:   9x15
*VT100.font3:   6x10
*VT100.font1:   nil2
*vtMenu*selectToClipboard*Label:    Select to Clipboard
*vtMenu*reversewrap*Label:  Enable Reverse Wraparound
*vtMenu*softreset*Label:    Do Soft Reset
*vtMenu*cursesemul*Label:   Enable Curses Emulation
*vtMenu*autolinefeed*Label: Enable Auto Linefeed
*vtMenu*hardreset*Label:    Do Full Reset
*vtMenu*visualbell*Label:   Enable Visual Bell
*vtMenu*appcursor*Label:    Enable Application Cursor Keys
*vtMenu*clearsavedlines*Label:  Reset and Clear Saved Lines
*vtMenu*bellIsUrgent*Label: Enable Bell Urgency
*vtMenu*appkeypad*Label:    Enable Application Keypad
*vtMenu*tekshow*Label:  Show Tek Window
*vtMenu*poponbell*Label:    Enable Pop on Bell
*vtMenu*scrollbar*Label:    Enable Scrollbar
*vtMenu*scrollkey*Label:    Scroll to Bottom on Key Press
*vtMenu*tekmode*Label:  Switch to Tek Mode
*vtMenu*scrollttyoutput*Label:  Scroll to Bottom on Tty Output
*vtMenu*jumpscroll*Label:   Enable Jump Scroll
*vtMenu*cursorblink*Label:  Enable Blinking Cursor
*vtMenu*vthide*Label:   Hide VT Window
*vtMenu*allow132*Label: Allow 80/132 Column Switching
*vtMenu*reversevideo*Label: Enable Reverse Video
*vtMenu*titeInhibit*Label:  Enable Alternate Screen Switching
*vtMenu*altscreen*Label:    Show Alternate Screen
*vtMenu*keepSelection*Label:    Keep Selection
*vtMenu*autowrap*Label: Enable Auto Wraparound
*vtMenu*activeicon*Label:   Enable Active Icon
*vtMenu.Label:  VT Options
*SimpleMenu*menuLabel.font: -adobe-helvetica-bold-r-normal--*-120-*-*-*-*-iso8859-*
*SimpleMenu*menuLabel.vertSpace:    100
*SimpleMenu*Sme.height: 16
*SimpleMenu*BackingStore:   NotUseful
*SimpleMenu*HorizontalMargins:  16
*SimpleMenu*Cursor: left_ptr
*SimpleMenu*borderWidth:    2
*menubar.borderWidth:   0
*tek4014*fontLarge: 9x15
*tek4014*font2: 8x13
*tek4014*font3: 6x13
*tek4014*fontSmall: 6x10
*MenuButton*borderWidth:    0
*fontMenu*render-font*Label:    TrueType Fonts
*fontMenu*fontdefault*Label:    Default
*fontMenu*font6*Label:  Huge
*fontMenu*allow-window-ops*Label:   Allow Window Ops
*fontMenu*utf8-mode*Label:  UTF-8 Encoding
*fontMenu*font1*Label:  Unreadable
*fontMenu*fontescape*Label: Escape Sequence
*fontMenu*utf8-fonts*Label: UTF-8 Fonts
*fontMenu*fontsel*Label:    Selection
*fontMenu*allow-bold-fonts*Label:   Bold Fonts
*fontMenu*utf8-title*Label: UTF-8 Titles
*fontMenu*font-linedrawing*Label:   Line-Drawing Characters
*fontMenu*font2*Label:  Tiny
*fontMenu*allow-color-ops*Label:    Allow Color Ops
*fontMenu*font-doublesize*Label:    Doublesized Characters
*fontMenu*font3*Label:  Small
*fontMenu*allow-font-ops*Label: Allow Font Ops
*fontMenu*font-loadable*Label:  VT220 Soft Fonts
*fontMenu*font4*Label:  Medium
*fontMenu*allow-tcap-ops*Label: Allow Termcap Ops
*fontMenu*font-packed*Label:    Packed Font
*fontMenu*font5*Label:  Large
*fontMenu*allow-title-ops*Label:    Allow Title Ops
*fontMenu.Label:    VT Fonts
*colorD:    #AD7FA8
*color5:    #75507B
*backarrowKeyIsErase:   true
*colorE:    #34E2E2
*color6:    #06989A
*ptyInitialErase:   true
*colorF:    #EEEEEC
*background:    #000000
*color7:    #D3D7CF
*saveLines: 1024
*color8:    #555753
*color0:    #2E3436
*foreground:    #4E9A06
*IconFont:  nil2
*color9:    #EF2929
*color1:    #CC0000
*visualBell:    true
*colorA:    #8AE234
*color2:    #4E9A06
*visualBellLine:    true
*colorB:    #FCE94F
*color3:    #C4A000
*colorC:    #729FCF
*color4:    #3465A4
$ xterm -u8 -fa "DejaVu Sans Mono"
# the following is typed in the resulting terminal:
$ echo -e "\xE2\x98\xA0"
n
# however when i copy the result from `echo -e "\xE2\x98\xA0"`
# into my browser, i get this: ☠ (a skull) but it does not show
# up as a skull in my xterm
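
the bytes themselves can be checked independently of any font. a minimal python 3 sketch that decodes the same three bytes and looks up the resulting character; it assumes nothing about xterm and only confirms what the bytes mean as utf-8:

```python
import unicodedata

# The three bytes passed to `echo -e` in the test above.
raw = b"\xE2\x98\xA0"

# decode() raises UnicodeDecodeError if the bytes are not valid UTF-8.
text = raw.decode("utf-8")

code_points = ["U+%04X" % ord(ch) for ch in text]
names = [unicodedata.name(ch) for ch in text]
print(code_points, names)
```

the bytes decode to a single character, U+2620 SKULL AND CROSSBONES, which agrees with the skull seen in the browser. so the data is valid utf-8 and the remaining suspect is how the terminal draws it.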

$ lsof -p $PPID | grep fonts
xterm   5990 me  mem    REG              254,1     4971 13501810 /usr/share/fonts/X11/misc/ter-u18b_iso-8859-1.pcf.gz
xterm   5990 me  mem    REG              254,1     4897 13505403 /usr/share/fonts/X11/misc/ter-u18n_iso-8859-1.pcf.gz

i also ran $ fc-list but the output was too large to paste into this question. so i have put it here

what it shows in my browser:

/usr/share/fonts/truetype/freefont/FreeSansBold.ttf: FreeSans:style=Bold,получерен,negreta,tučné,fed,Fett,Έντονα,Negrita,Lihavoitu,Gras,Félkövér,Grassetto,Vet,Halvfet,Pogrubiony,Negrito,gros,Полужирный,Fet,Kalın,huruf tebal,жирний,Krepko,treknraksts,pusjuodis,đậm,Lodia,धृष्ट

what i see in my terminal:

/usr/share/fonts/truetype/freefont/FreeSansBold.ttf: FreeSans:style=Bold,nnnnnnnnn,negreta,tunné,fed,Fett,nnnnnn,Negrita,Lihavoitu,Gras,Félkövér,Grassetto,Vet,Halvfet,Pogrubiony,Negrito,gros,nnnnnnnnnn,Fet,Kalın,huruf tebal,nnnnnn,Krepko,treknraksts,pusjuodis,nậm,Lodia,nnn

interestingly, some "special" characters do show up in my terminal, but most are replaced by n. you can see in the previous output that none of получерен can be displayed, but the final character of tučné can be displayed (while the middle č cannot – it is replaced by n).
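
one possible pattern, given the iso-8859-1 font files in the lsof output above, is that only characters inside the iso-8859-1 range (up to U+00FF) get drawn. this is only a guess, and the python 3 sketch below just checks it against the tučné example; the helper name outside_latin1 is made up for illustration:

```python
def outside_latin1(word):
    """Return the characters of word whose code point is above U+00FF."""
    return [ch for ch in word if ord(ch) > 0xFF]


# "tučné", written with escapes so the exact code points are unambiguous.
word = "tu\u010dn\u00e9"

for ch in word:
    status = "outside" if ord(ch) > 0xFF else "inside"
    print("U+%04X is %s the ISO-8859-1 range" % (ord(ch), status))
```

for this word the only character outside the range is č (U+010D), which is the one replaced by n, while é (U+00E9) is inside the range and displays.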


as per @apaul's comments it seems that xterm isn't loading the right font. try to set a dummy class so it doesn't load the xterm resources:

$ xterm -class Foo -name foo -u8 -fa "DejaVu Sans Mono:style=Book"
$ # the following commands are all executed in the resulting terminal:
$ echo -e "\xE2\x98\xA0"
☠
$ # the above skull actually shows up now. and so does the unicode
$ # output from sympy and also vi can display greek symbols now :)

all that remains is to figure out why xterm cannot set the font using ~/.Xresources, and to get this working. it seems like something must be overriding the font settings?

actually i just thought to try the above command with the terminus font, and it seems that this is the problem:

$ xterm -class Foo -name foo -u8 -fa "Terminus"
$ # the following commands are all executed in the resulting terminal:
$ echo -e "\xE2\x98\xA0"
n

maybe terminus is not properly installed? or is being mapped to something else. how could i find that out?

Best Answer

Writing in 2016, talking about xterm patch #278 (released in 2012):

xterm uses a single font, rather than font sets which are supported by several other terminals. The pseudo-graphic characters in this (pasted from xterm):

⎛     ⎽⎽⎽⎽⎽⎽⎽   ⎞
⎜    ╱    3     ⎟
⎜   ╱    x      ⎟
⎜  ╱   ───── , 1⎟
⎝╲╱    x + 1    ⎠

are not provided by the TrueType font specified here:

xterm.vt100.faceName: Terminus
xterm.vt100.faceSize: 14

Other terminals, given that font, would provide those characters from another font.

The way to make xterm work is

  • specify a font which does cover all of the characters needed, and
  • tell it to use UTF-8 encoding.

The latter is addressed for most users by the default setting of the locale resource: xterm will (usually) use UTF-8 encoding. But the default behavior is VT100-compatible, hence the use of ISO-8859-1 compatible fonts.

  • Terminus uses more glyphs than that, but falls far short of covering all pseudo-graphics in Unicode.
  • The ones that display as n are U+239B, U+239C, U+239D, U+239E, U+23A0.
  • The version of Terminus in Debian 7 (and Debian testing) has fewer than 256 glyphs and happens to show n as described in the question.

That happens because xterm, although it knows that the glyphs are missing, prints the string using the font anyway, assuming that (as with most other fonts) missing entries will be shown as blanks. In this case, the FreeType library seems to be mapping the low-order byte of the Unicode values into the range that Terminus supports. That happens to fall in a range that the font displays as n (for "no such character").
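
The low-order-byte mapping described above can be worked through for the code points listed earlier. This is a minimal Python 3 sketch that only does the arithmetic; it makes no claim about what FreeType or Terminus actually does internally, and the helper name low_byte is made up for illustration:

```python
# Code points listed above as displaying as "n".
code_points = [0x239B, 0x239C, 0x239D, 0x239E, 0x23A0]


def low_byte(code_point):
    """Keep only the low-order 8 bits of a code point."""
    return code_point & 0xFF


for cp in code_points:
    print("U+%04X -> 0x%02X" % (cp, low_byte(cp)))
```

All five results land between 0x9B and 0xA0, so each fits in a single byte, which is the kind of value a font with fewer than 256 glyphs could be indexed by.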

The quick workaround uses the uxterm script, which selects a different font and ensures that UTF-8 encoding is used.

Further reading:

Terminus Font is a clean, fixed width bitmap font, designed for long (8 and more hours per day) work with computers. Version 4.40 contains 1241 characters, covers about 120 language sets and supports ISO8859-1/2/5/7/9/13/15/16, Paratype-PT154/PT254, KOI8-R/U/E/F, Esperanto, many IBM, Windows and Macintosh code pages, as well as the IBM VGA, vt100 and xterm pseudographic characters.

The above was talking about xterm patch #278 which was four years old in 2016. Development of xterm is ongoing, and beginning with patch #338 (late 2018) there is support for TrueType fontsets. Here is a screenshot using the OP's resource-settings from xterm patch #342 (#343 will probably be out "soon"):

screenshot from xterm #342

Using the -report-fonts option, I see that it loaded these font-files (treating bold/italic as the "same" as normal, and using a second font for the special characters):

    file=/usr/share/fonts/X11/misc/ter-u18n_iso-8859-1.pcf.gz
    file=/usr/share/fonts/X11/misc/ter-u18b_iso-8859-1.pcf.gz
    file=/usr/share/fonts/X11/misc/ter-u18n_iso-8859-1.pcf.gz
    file=/usr/share/fonts/truetype/dejavu/DejaVuSansMono.ttf

The actual number of fonts depends on what you want to do. In testing the existing range of Unicode values, it may use a couple of dozen fonts.
