When trying to format printf output involving strings containing multi-byte characters, it became clear that printf counts bytes rather than characters, which makes formatting text difficult if single-byte and multi-byte characters are mixed. For example:
$ cat script
#!/bin/bash
declare -a a b
a+=("0")
a+=("00")
a+=("000")
a+=("0000")
a+=("00000")
b+=("0")
b+=("├─00")
b+=("├─000")
b+=("├─0000")
b+=("└─00000")
printf "%-15s|\n" "${a[@]}" "${b[@]}"
$ ./script
0              |
00             |
000            |
0000           |
00000          |
0              |
├─00       |
├─000      |
├─0000     |
└─00000    |
I found various suggested work-arounds (mainly wrappers using another language or utility to print the text). Are there any native bash solutions? None of the documented printf format strings appear to help. Would the locale settings be relevant in this situation, e.g., to use a fixed-width character encoding like UTF-32?
Best Answer
You could work around it by telling the terminal to move the cursor to the desired position, instead of having printf count the characters. Well, assuming you're printing to a terminal, that is...
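A minimal sketch of the cursor-positioning idea, not necessarily the code originally posted with this answer. The helper name print_at_column is made up for illustration, and the output only lines up on a terminal that understands the escape sequence:

```bash
#!/bin/bash
# print_at_column is a hypothetical helper: print each string, then move the
# cursor to a fixed column before printing the "|" delimiter.
print_at_column() {   # usage: print_at_column COLUMN STRING...
    local col=$1 s
    shift
    for s in "$@"; do
        # %s prints the text as-is; ESC [ <n> G moves the cursor to column n
        # (columns are 1-based), so no character counting is needed.
        printf '%s\e[%dG|\n' "$s" "$col"
    done
}

a=("0" "00" "000")
b=("├─00" "├─000" "└─00000")
# Column 16 is where "%-15s|" would put the delimiter.
print_at_column 16 "${a[@]}" "${b[@]}"
```

Since the alignment is done by the terminal, redirecting this to a file or a pipe leaves the raw escape sequences in the output.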
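The control sequence for this is <ESC>[nnG, where nn is the column to move to, in decimal.
Of course, if the first column is longer than the allocated space, the result isn't too nice: the cursor jumps back to the requested column and whatever is printed next overwrites the end of the text.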
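The overflow is easy to reproduce with a sketch like the one below. It also includes a variant that sends <ESC>[K (erase to end of line) right after the cursor move, which removes the leftover tail of the overlong text. Both function names are hypothetical, and the renderings in the comments assume an xterm-compatible terminal:

```bash
#!/bin/bash
# Plain version: text, jump to the column, then the delimiter.
cell_plain() {   # usage: cell_plain COLUMN TEXT
    printf '%s\e[%dG|\n' "$2" "$1"
}

# Variant: after the jump, ESC [ K erases from the cursor to the end of the
# line, so the tail of an overlong string is gone before "|" is printed.
cell_clear() {   # usage: cell_clear COLUMN TEXT
    printf '%s\e[%dG\e[K|\n' "$2" "$1"
}

cell_plain 6 "0123456789"   # shows as: 01234|6789  ("|" overwrites the "5")
cell_clear 6 "0123456789"   # shows as: 01234|      (text truncated at column 6)
```

Truncating silently may or may not be what you want; it keeps the columns aligned at the cost of losing part of the text.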
To work around that, you could explicitly clear the rest of the line (<ESC>[K) before printing the following column.
Another way would be to do the padding manually, assuming we have something that can determine the length of the string in characters. This seems to work in Bash for simple characters, but is of course a bit ugly. Zero-width and double-width characters will probably break it, and I didn't test combining characters either.
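One way to sketch the manual padding, assuming Bash's ${#var} expansion is used as the length function (it counts characters rather than bytes only when the current locale is a multi-byte one such as UTF-8). The helper name pad_right is made up for illustration:

```bash
#!/bin/bash
# pad_right is a hypothetical helper: print TEXT followed by enough spaces to
# fill WIDTH characters. The length comes from ${#s}, which is a character
# count only in a multi-byte locale (e.g. UTF-8), and it knows nothing about
# zero-width or double-width characters.
pad_right() {   # usage: pad_right WIDTH TEXT
    local width=$1 s=$2
    local pad=$(( width - ${#s} ))
    (( pad < 0 )) && pad=0          # overlong text: no padding, no truncation
    # "%*s" with an empty argument prints exactly $pad spaces
    printf '%s%*s' "$s" "$pad" ''
}

a=("0" "00000")
b=("├─00" "└─00000")
for s in "${a[@]}" "${b[@]}"; do
    pad_right 15 "$s"
    printf '|\n'
done
```

Unlike the cursor-movement approach, this writes plain spaces, so the result also works when the output goes to a file or a pipe.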
And the output is: