Can I use read to capture the \n (\012), or newline, character?
Define test function:
f() { read -rd '' -n1 -p "Enter a character: " char &&
printf "\nYou entered: %q\n" "$char"; }
Run the function, press Enter:
$ f;
Enter a character:
You entered: ''
Hmmm. It's a null string.
How do I get my expected output:
$ f;
Enter a character:
You entered: $'\012'
$
I want the same method to be able to capture ^D, or \004.
If read can't do it, what is the workaround?
Best Answer
To read 1 character, use -N instead (as in read -rN1 var), which always reads one character and doesn't do $IFS processing.

read -rn1 reads one record of up to one character and still does $IFS processing (newline is in the default value of $IFS, which explains why you get an empty result even though you read NUL-delimited records). You'd use it instead to limit the length of the record you read.

In your case, with NUL-delimited records (with -d ''), IFS= read -d '' -rn1 var would work the same, as bash cannot store a NUL character in its variables anyway, so printf '\0' | read -rN1 var would leave $var empty and return a non-zero exit status.

To be able to read arbitrary characters including NUL, you'd use the zsh shell instead, where the syntax is read -k1 var. There is no need for -r or IFS= there. However, note that read -k reads from the terminal (k is for key; zsh's -k option predates bash's and even ksh93's -N by decades). To read from stdin, use read -u0 -k1.

Example (here pressing Ctrl+Space to enter a NUL character):
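The same contrast can be sketched non-interactively by feeding a NUL byte on stdin instead of pressing Ctrl+Space. This is only an illustration: the variable names bash_result and zsh_result are made up here, and the zsh part runs only if zsh happens to be installed.

```shell
# bash: the NUL byte cannot be stored, so var stays empty and read fails at EOF.
bash_result=$(printf '\0' | bash -c 'IFS= read -rN1 var; echo "status=$? length=${#var}"')
echo "bash: $bash_result"

# zsh (only if available): -u0 makes read -k1 take its input from stdin
# rather than from the terminal.
if command -v zsh >/dev/null 2>&1; then
  zsh_result=$(printf '\0' | zsh -c 'read -u0 -k1 var; echo "status=$? length=${#var}"')
  echo "zsh:  $zsh_result"
fi
```

In bash the reported length should be 0 with a non-zero status, as described above; in zsh the variable is expected to hold the single NUL character.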
Note that to be able to read a character, read may have to read more than one byte. If the input starts with the first byte of a multi-byte character, it will read at least one more byte, so you could end up with $var containing something that the shell considers to have a length greater than 1 if the input contains byte sequences that don't form valid characters.

For instance in a UTF-8 locale:
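A minimal sketch of such an experiment, assuming a UTF-8 locale is already active. The input is the byte 0xFC followed by four continuation bytes (0x80) and then the ASCII letters XYZ. The variable names are illustrative, and the exact result may vary with the bash and libc versions in use.

```shell
# Assumption: a UTF-8 locale is already active in the environment.
# Input: 0xFC, four 0x80 continuation bytes, then "XYZ" (8 bytes in total).
result=$(printf '\374\200\200\200\200XYZ' |
  bash -c 'read -rN1 var
           printf "content: %q\n" "$var"
           printf "length: %s\n" "${#var}"
           printf "bytes left unread: %s\n" "$(wc -c)"')
printf '%s\n' "$result"
```

The wc -c call counts whatever read left on stdin, which shows how many bytes were consumed in the attempt to read one character.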
In UTF-8, 0xFC is the first byte of a 6-byte character; the 5 other bytes are meant to have the 8th bit set and the 7th bit unset, but we provide only 4. read still reads that extra X to try to find the end of the character, so the X ends up in $var along with those 5 bytes, which don't form a valid character and end up being counted as one character each.
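Applied to the test function from the question, the -N suggestion might look like the sketch below. It drops -d '' on the assumption that -N does not treat delimiters specially, and otherwise keeps the question's prompt and variable name.

```shell
f() {
  # -N1 reads exactly one character; neither the delimiter nor $IFS is
  # treated specially, so a newline is stored like any other character.
  read -rN1 -p "Enter a character: " char &&
    printf '\nYou entered: %q\n' "$char"
}
```

Running f at a terminal and pressing Enter should then report a newline rather than an empty string, and pressing Ctrl+D should be captured as \004, since bash reads the terminal one character at a time in this mode. The exact %q spelling (for example $'\n' versus $'\012') depends on the bash version.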