The unit separator ASCII character (ASCII 31, octal 37) is visible in Vim as ^_. But if I print the same file to the terminal, the character is invisible. This causes the fields on a line to get stuck together:
# In Vim and less:
first field^_second field^_last field
# cat the same file to terminal:
cat delim.txt
first fieldsecond fieldlast field
# print 2nd field with awk
cat delim.txt | awk 'BEGIN {FS = "\037"} {print $2}'
second field
I suppose I can make the unit separator visible with cat -v:
cat -v delim.txt
first field^_second field^_last field
But this is rather cumbersome. Why doesn't the unit separator have a visible representation when printed to stdout in the Bash shell? I can't even copy and paste the shell output correctly; the unit separator gets lost in the process.
Best Answer
The unit separator (US) character, also known as IS1, is in the cntrl character class and is not in the print character class. It is a control character intended for organizing text into groups, for programs that are designed to make use of that information. In general, non-printable characters are likely to be interpreted and rendered differently in different programs or environments.
The reason you are seeing it represented as ^_ in Vim is that Vim is an interactive editor. It can freely render non-printable characters however it wants, as long as the correct binary character is written to disk.
You cannot get the same behavior in the shell because Unix shell programs are written to operate on and pass plain text to each other. When you cat a file, the text that is written to the terminal must be what is actually in the file.
So that leaves it to the terminal device to interpret the character. And it turns out that some terminal emulators do render the US character differently from others. In gnome-terminal (or any vte-based terminal), the character will be rendered as a box containing the hex code 001F. In xterm or rxvt, the character is indeed invisible.
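If the goal is just to see the separators in terminal output, one possible approach is to swap each US byte for a printable stand-in before it reaches the terminal. The following is a minimal sketch, not the only way to do it: the helper name show_us and the choice of | as the stand-in are assumptions made for illustration, and delim.txt is recreated here so the example is self-contained.

```shell
# Recreate the sample file from the question: three fields joined by US (octal 037).
printf 'first field\037second field\037last field\n' > delim.txt

# Count the lines that contain a control character (US belongs to the cntrl class).
LC_ALL=C grep -c '[[:cntrl:]]' delim.txt

# Hypothetical helper (not a standard command): replace every US byte
# with a visible '|' so the field boundaries survive display and copy/paste.
show_us() {
    tr '\037' '|'
}

show_us < delim.txt
```

The file on disk is left untouched; only the displayed copy is altered. If the data can itself contain |, a different stand-in would be needed.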