Corrupted gnu-screen session not displaying UTF-8 correctly

emacs, gnu-screen, terminal

(Edited to clarify the role of Emacs in the problem with the display.)

My current gnu-screen session has gotten corrupted somehow, and Emacs fails to display UTF-8 characters properly.

I've confirmed that in freshly started gnu-screen processes, Emacs displays UTF-8 characters properly, but at the moment it would be very disruptive to replace the corrupted gnu-screen session with a new one. Instead, I'm looking for ways to further troubleshoot the problem with this corrupted gnu-screen session and, hopefully, fix it.


FWIW, I give more background below, including a description of what I've done so far to diagnose the problem.

I started this gnu-screen session several days ago at my OS X workstation at work with

% screen -U

…(as I always do). Since then I have re-attached this session from several machines (possibly after first ssh-ing to my workstation at work) using

% screen -U -dR

(again, this is what I always do). This morning I did precisely this at my workstation at work (the machine where the gnu-screen process is actually running).

Today, for the first time since I created this gnu-screen session, I needed to work with files that contain a lot of non-ASCII UTF-8 characters. It was then that I discovered that this gnu-screen session must have gotten corrupted somehow, because it displays all these characters as ?, resulting in an unusable display.

(As I already alluded to, these UTF-8-rich files are displayed correctly by freshly started gnu-screen sessions, so I'm pretty sure that the display problem lies with the particular gnu-screen session that I'm calling "corrupted" here. Also, I confirmed that the "??? display" shows up in every terminal from which I have attached the gnu-screen session, so the problem is not with the terminal program hosting the gnu-screen session. Lastly, I confirmed that in the corrupted gnu-screen session every new Emacs session displays the UTF-8 characters as ?, which argues against the problem being with one particular Emacs session.)

I've confirmed that utf8 is on by running

:utf8 on on

The output of :info is

(1,5)/(210,52)+10000 +(-)flow app log UTF-8 0(zsh)

And, FWIW:

% /usr/local/bin/screen --version
Screen version 4.00.03 (FAU) 23-Oct-06

Also, I should point out that new

What else can I do to troubleshoot this problem?


UPDATE: Drav Sloan and Stephane Chazelas both asked about my locale settings:

% locale
LANG=
LC_COLLATE="C"
LC_CTYPE="C"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=

Currently, for OS X I don't set any locale-related variables.

On Linux systems, my .zshenv does set

export LANG=en_US.utf8
export LC_ALL=en_US.utf8

…but if I put the same lines in my .zshenv on Darwin, I get error messages to the effect that "setting locale failed." I vaguely remember bashing my skull for several hours over the problem of finding the right locale settings for Darwin/Lion. It may have been that "setting nothing" emerged as the "least awful" solution to the problem and, after all, at least fresh gnu-screen sessions do display UTF-8 characters correctly, even in the absence of an explicit locale setting. But clearly I need to figure out how to properly set locale in Darwin/Lion…

UPDATE2: OK, I think I figured out the reason for the errors I mentioned above: in Darwin/Lion, the string en_US.utf8 is invalid; instead it should be en_US.UTF-8.
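Since the accepted spelling differs between systems (en_US.utf8 versus en_US.UTF-8), one option is to let the shell startup file pick whichever spelling the system actually lists. The following is only a sketch under that assumption; `pick_utf8_locale` is a hypothetical helper name, and it relies on `locale -a` printing one locale name per line.

```shell
# Hypothetical helper: read locale names on stdin (for example the output
# of `locale -a`) and print the first en_US UTF-8 spelling that appears.
pick_utf8_locale() {
    grep -i -E '^en_US\.utf-?8$' | head -n 1
}

# Possible use in a startup file such as .zshenv: export the locale only
# if the system reports a matching name, otherwise leave things untouched.
utf8_locale=$(locale -a 2>/dev/null | pick_utf8_locale || true)
if [ -n "$utf8_locale" ]; then
    export LANG="$utf8_locale" LC_ALL="$utf8_locale"
fi
```

Because the helper reads from stdin, it can be tried against a hand-written list before trusting it in a startup file.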

Best Answer

I had the same issue when running:

git clone https://github.com/jwiegley/git-scripts.git
cd git-scripts
perl git-forest 

I used this as my test. Basically, you should get nice lines if you have UTF-8 set up properly. If not, you will get ugly boxes or stray characters.
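A lighter check that needs no repository is to print a few multibyte characters straight from the shell. This is only a sketch: it assumes three box-drawing characters are representative enough, and `utf8_probe` is a hypothetical name. The bytes are written as octal escapes so the snippet itself stays pure ASCII.

```shell
# Print three UTF-8 box-drawing characters (U+250C, U+2500, U+2510)
# followed by a newline. Each character is three bytes long.
utf8_probe() {
    printf '\342\224\214\342\224\200\342\224\220\n'
}

utf8_probe
```

If the window handles UTF-8, this should render as a short horizontal line with two corners; question marks or boxes suggest the same problem the git-forest test reveals.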

The solution is that you need to set LC_ALL to en_US.UTF-8 BEFORE you start a new screen session. I tried doing it after creating the screen session and had no luck.
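As far as I understand it, the order matters because a process receives its environment when it is launched, so a running screen keeps the LC_ALL it was started with. A minimal sketch of setting the variable for one command only follows; `with_utf8_locale` is a hypothetical helper name, not something screen provides.

```shell
# Hypothetical helper: run the given command with LC_ALL set to
# en_US.UTF-8 for that command only, leaving the calling shell unchanged.
with_utf8_locale() {
    env LC_ALL=en_US.UTF-8 "$@"
}

# The child process sees the value it was launched with.
with_utf8_locale sh -c 'printf "%s\n" "$LC_ALL"'   # prints en_US.UTF-8

# A new session could then be started the same way:
#   with_utf8_locale screen -U
```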

Here are the steps I followed to get this going:

1) Run locale to view the current setup. I got this (which explains why I was having issues):

LANG=en_US.UTF-8
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_PAPER="C"
LC_NAME="C"
LC_ADDRESS="C"
LC_TELEPHONE="C"
LC_MEASUREMENT="C"
LC_IDENTIFICATION="C"
LC_ALL=C

2) Before creating a new screen session, redefine LC_ALL as en_US.UTF-8.

If you are using csh:

setenv LC_ALL en_US.UTF-8

If you are using bash:

export LC_ALL="en_US.UTF-8"

3) Verify that LC_ALL was set properly by running locale again:

LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8

4) Now start a new screen session and run the git-forest test; you should see nice lines.
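The output in step 1 shows why LANG=en_US.UTF-8 alone was not enough: under the POSIX precedence rules, LC_ALL overrides every LC_* category and LANG, so LC_ALL=C won. The sketch below only illustrates that lookup order for the character type; `effective_ctype` is a hypothetical helper, not a real command.

```shell
# Hypothetical helper: given the values of LC_ALL, LC_CTYPE and LANG (in
# that order), print the one that decides the character encoding.
# Precedence: LC_ALL, then LC_CTYPE, then LANG, then the default "C".
effective_ctype() {
    if [ -n "$1" ]; then printf '%s\n' "$1"
    elif [ -n "$2" ]; then printf '%s\n' "$2"
    elif [ -n "$3" ]; then printf '%s\n' "$3"
    else printf 'C\n'
    fi
}

# Step 1 situation: LC_ALL=C hides LANG=en_US.UTF-8.
effective_ctype "C" "" "en_US.UTF-8"             # prints C

# Step 3 situation: LC_ALL now carries the UTF-8 locale.
effective_ctype "en_US.UTF-8" "" "en_US.UTF-8"   # prints en_US.UTF-8
```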
