You need to remove whitespace characters from the $IFS parameter for read to stop skipping leading and trailing ones (with -n1, the whitespace character, if any, would be both leading and trailing, so skipped):
while IFS= read -rn1 a; do printf %s "$a"; done
But even then bash's read will skip newline characters, which you can work around with:
while IFS= read -rn1 a; do printf %s "${a:-$'\n'}"; done
Though you could use IFS= read -d '' -rn1 instead or, even better, IFS= read -N1 (added in 4.1, copied from ksh93 (added in o)), which is the command to read one character.
Note that bash's read can't cope with NUL characters. And ksh93 has the same issues as bash.
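To see the difference between the two variants, here is a minimal sketch. The helper names copy_chars_n1 and copy_chars_N1 are made up for illustration, and bash 4.1 or later is assumed for read -N.

```shell
# Sketch only; helper names are hypothetical. Assumes bash >= 4.1 for read -N.

# Plain -n1: newline is still the delimiter, so $c is empty for it.
copy_chars_n1() {
  bash -c 'while IFS= read -rn1 c; do printf %s "$c"; done'
}

# -N1: reads exactly one character, newline included.
copy_chars_N1() {
  bash -c 'while IFS= read -rN1 c; do printf %s "$c"; done'
}

printf ' a b\n c\n' | copy_chars_n1 | od -c   # newlines are lost
printf ' a b\n c\n' | copy_chars_N1 | od -c   # input is copied unchanged
```

Piping through od -c makes the spaces and newlines visible, so the two outputs can be compared byte by byte.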
With zsh:
while read -ku0 a; do print -rn -- "$a"; done
(zsh can cope with NUL characters).
Note that those read -k/n/N read a number of characters, not bytes. So for multibyte characters, they may have to read multiple bytes until a full character is read. If the input contains invalid characters, you may end up with a variable that contains a sequence of bytes that doesn't form valid characters and which the shell may end up counting as several characters. For instance in a UTF-8 locale:
$ printf '\375\200\200\200\200ABC' | bash -c '
IFS= read -rN1 a; echo "${#a}"'
6
That \375 would introduce a 6-byte UTF-8 character. However, the 6th byte (A) above is invalid for a UTF-8 character. You still end up with \375\200\200\200\200A in $a, which bash counts as 6 characters though the first 5 ones are not really characters, just 5 bytes not forming part of any character.
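The characters-versus-bytes point can also be checked with a valid character. The following is a sketch only: first_char_bytes is a hypothetical helper that takes the locale name as its argument, and bash 4.1 or later is assumed.

```shell
# Sketch only; first_char_bytes is a hypothetical helper. Assumes bash >= 4.1.
# $1 is the locale to run bash in; stdin supplies the data.
# Prints how many bytes the first character read by read -N1 occupied.
first_char_bytes() {
  LC_ALL=$1 bash -c 'IFS= read -rN1 a; printf %s "$a" | wc -c | tr -d " "'
}

# The two bytes \303\251 encode a single character (é) in UTF-8.
# In the C locale every byte is its own character, so only one byte is read.
printf '\303\251' | first_char_bytes C
```

Passing the name of an installed UTF-8 locale instead of C should make the same read consume both bytes, since they form one character there.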
In many computer languages, operators with the same precedence are left-associative. That is, in the absence of grouping structures, leftmost operations are executed first. Bash is no exception to this rule.
This is important because, in Bash, && and || have the same precedence.
So what happens in your example is that the leftmost operation (||) is carried out first:
true || echo aaa
Since true is obviously true, the || operator short-circuits and the whole statement is considered true without the need to evaluate echo aaa, as you would expect. Now it remains to do the rightmost operation:
(...) && echo bbb
Since the first operation evaluated to true (i.e. had a 0 exit status), it's as if you're executing
true && echo bbb
so the && will not short-circuit, which is why you see bbb echoed.
You would get the same behavior with
false && echo aaa || echo bbb
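The two lists above, plus an explicitly grouped variant for contrast, can be run together as a quick check. The wrapper name and_or_demo is made up for illustration.

```shell
# Sketch; and_or_demo is a hypothetical wrapper so the lists can be run together.
and_or_demo() {
  # Parsed as (true || echo aaa) && echo bbb: prints bbb
  true || echo aaa && echo bbb

  # Explicit grouping changes the result: nothing is printed
  true || { echo aaa && echo bbb; }

  # Parsed as (false && echo aaa) || echo bbb: prints bbb
  false && echo aaa || echo bbb
}

and_or_demo
```

The braces in the second list are the way to ask for a different grouping when the left-to-right one is not what you want.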
Notes based on the comments
- You should note that the left-associativity rule is only followed when both operators have the same precedence. This is not the case when you use these operators in conjunction with keywords such as [[...]] or ((...)) or use the -o and -a operators as arguments to the test or [ commands. In such cases, AND (&& or -a) takes precedence over OR (|| or -o). Thanks to Stephane Chazelas' comment for clarifying this point.
- It seems that in C and C-like languages && has higher precedence than ||, which is probably why you expected your original construct to behave like true || (echo aaa && echo bbb). This is not the case with Bash, however, in which both operators have the same precedence, which is why Bash parses your expression using the left-associativity rule. Thanks to Kevin's comment for bringing this up.
- There might also be cases where all 3 expressions are evaluated. If the first command returns a non-zero exit status, the || won't short-circuit and the second command is executed. If the second command returns a zero exit status, then the && won't short-circuit either and the third command will be executed. Thanks to Ignacio Vazquez-Abrams' comment for bringing this up.
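The notes above can be illustrated with a short sketch. The wrapper name precedence_demo and the numeric tests are chosen purely for illustration, and bash is assumed for the [[ ]] part.

```shell
# Sketch; precedence_demo is a hypothetical wrapper. Assumes bash for [[ ]].
precedence_demo() {
  # Inside [[ ]], && binds tighter than ||, so this is
  # 1 -eq 1 || (1 -eq 2 && 1 -eq 3), which is true.
  bash -c '[[ 1 -eq 1 || 1 -eq 2 && 1 -eq 3 ]] && echo "[[ ]]: true"'

  # As a command list the same tests group left to right:
  # ([ 1 -eq 1 ] || [ 1 -eq 2 ]) && [ 1 -eq 3 ], which is false.
  [ 1 -eq 1 ] || [ 1 -eq 2 ] && [ 1 -eq 3 ] || echo "list: false"

  # First command fails, second succeeds: all three are evaluated.
  false || echo second && echo third
}

precedence_demo
```

The same three comparisons give opposite results depending on whether they sit inside [[ ]] or are chained as separate commands.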
printf '\101' where 101 is an octal number outputs the byte with that value. When sent to an ASCII terminal, that will be rendered as A, as A is character 65 (octal 101) in ASCII and all ASCII-compatible character sets (which includes most modern charsets with the exception of the EBCDIC ones still used on some IBM systems).
In
Which should have been written:
as leaving parameter expansions (like $1) or command substitutions ($(...)) unquoted is the split+glob operator in Bourne-like shells, which is not wanted here.
- printf '%03o' "$1" converts the number in $1 to a 3-digit octal
- printf "\\$(...)" appends that octal to a \ (\\ inside double quotes becomes \) and passes that to printf so it will output the corresponding byte value.
Note that it only works in locales where the charset is one byte per character (like iso8859-1) or, in locales with a multi-byte charset, only for values 0 to 127.
In bash, printf '%d\n' "'A" prints the Unicode code-point of character A (or at least the value returned by mbtowc(), which on GNU systems at least is the Unicode code-point).
Some other implementations (including the standalone GNU printf utility) instead return the value of the first byte of the character.
For ASCII characters like A and on ASCII-based systems, that doesn't make any difference, but for others it matters. For instance the Greek α character (U+03B1) is encoded as:
Bash's printf '%d\n' "'α" will always output 945 (0x03b1 in hexadecimal), which is the Unicode code point of α regardless of the locale (at least on GNU systems), but others may return 225, 206 or 166 depending on the locale.
You can see from this that those chr and ord are only the reverse of each other for ASCII characters (or values 0 to 127), or in locales using the iso8859-1 character set for all characters (values 0 to 255).
If ord() is meant to return the Unicode code point, then the reverse (print the character corresponding to a Unicode code point) would be:
(assuming bash 4.3 or above (\UXXXXXXXX was added in 4.2, but didn't work properly for characters U+0080 to U+00FF until 4.3)).
Then, in any locale:
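As a rough sketch of a code-point-based pair (an assumption about how it could be written, not necessarily the definitions this answer originally used): the names ord_cp and chr_cp are made up, bash 4.3 or later is assumed, and non-ASCII output additionally depends on the locale's charset being able to represent the character.

```shell
# Sketch only; ord_cp and chr_cp are hypothetical names. Assumes bash >= 4.3.

# Print the numeric value printf assigns to the first character of $1
# (the Unicode code point with bash's builtin printf on GNU systems).
ord_cp() {
  printf '%d\n' "'$1"
}

# Print the character for the decimal Unicode code point in $1.
# Run under bash explicitly, since \U is not understood by every printf.
chr_cp() {
  bash -c 'printf "\\U$(printf %08x "$1")\n"' chr_cp "$1"
}

ord_cp A
chr_cp 65
```

The inner printf turns the decimal value into eight hex digits, and the outer one interprets the resulting \UXXXXXXXX escape.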
Or for ord() to return the values of the bytes of the encoding of a given character (in the current locale):
And for chr() to output those bytes:
Then, in a UTF-8 locale for instance:
(your ord α would give 945, your chr would give garbage for both chr 945 and chr 206 177).
Or in a locale using iso8859-7:
(your ord α would give 945, though could give 225 if printf was replaced with /usr/bin/printf on a GNU system).
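A byte-oriented pair along the lines described above could be sketched as follows. The names ord_bytes and chr_bytes and the use of od are assumptions for illustration, not necessarily what this answer originally showed.

```shell
# Sketch only; ord_bytes and chr_bytes are hypothetical names.

# Print the decimal value of each byte of $1, separated by spaces.
ord_bytes() {
  # Splitting od's output on whitespace is intended here, hence no quotes.
  set -- $(printf %s "$1" | od -An -vtu1)
  echo "$*"
}

# Output the bytes whose decimal values are given as arguments.
chr_bytes() {
  for byte in "$@"; do
    # Quoted expansions; each value becomes a \ooo octal escape for printf.
    printf "\\$(printf %03o "$byte")"
  done
  echo
}

ord_bytes AB
chr_bytes 65 66
# Round trip through raw bytes, whatever they mean in the current charset.
ord_bytes "$(chr_bytes 206 177)"
```

Because both functions deal in bytes rather than characters, they stay the reverse of each other regardless of the locale's charset.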