Your redirections have a race condition. This:
>(wc -l | awk '{print $1}' > n.txt)
runs in parallel with:
awk 'BEGIN{getline n < "n.txt"}...'
later in the pipeline. Sometimes, n.txt is still empty when the awk program starts running.
This is (obliquely) documented in the Bash Reference Manual. In a pipeline:
The output of each command in the pipeline is connected via a pipe to the input of the next command. That is, each command reads the previous command’s output. This connection is performed before any redirections specified by the command.
and then:
Each command in a pipeline is executed in its own subshell
(emphasis added). All the processes in the pipeline are started, with their input and output connected together, without waiting for any of the earlier programs to finish or even start doing anything. Before that, process substitution with >(...) is:
performed simultaneously with parameter and variable expansion, command substitution, and arithmetic expansion.
What that means is that the subprocess running the wc -l | awk ... command starts early on, and the redirection empties n.txt just before that, but the awk process that causes the error is started shortly after. Both of those commands execute in parallel, so you'll have several processes going at once here.
The error occurs when awk runs its BEGIN block before the wc command's output has been written into n.txt. In that case, the n variable is empty, and so is zero when used as a number. If the BEGIN block runs after the file is filled in, everything works.
Which of those happens depends on the operating system scheduler and on which process gets a slot first, which is essentially random from the user's perspective. If the final awk gets to run early, or the wc pipeline gets scheduled a little later, the file will still be empty when awk starts doing its work and the whole thing will break. In all likelihood the processes will actually run simultaneously on different cores, and it comes down to which one reaches the point of contention first. The effect you'll probably see is the command working more often than not, but sometimes failing with the error you posted.
In general, pipelines are only safe in so far as they're just pipelines: standard output into standard input is fine. Because the processes execute in parallel, you cannot rely on the sequencing of any other communication channel, such as files, or on any part of one process executing before or after any part of another, unless they're locked together by reading standard input.
The workaround here is probably to do all your file writing in advance of needing the files: at the end of a line, it's guaranteed that an entire pipeline and all of its redirections have completed before the next command runs. (That guarantee covers ordinary pipelines and redirections; the shell does not wait for a >(...) process substitution in the same way.) As written, this command will never be reliable, but if you really do need it to work in this sort of structure you can insert a delay (sleep) or loop until n.txt is non-empty before running the final awk command, to increase the chances of things working how you want.
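A minimal sketch of that ordering, assuming a hypothetical input.txt in a temporary directory stands in for whatever feeds the original pipeline. The first command finishes writing n.txt before the second one is started, so the BEGIN block always finds the count:

```shell
# Hypothetical sample input; it stands in for the data fed to the original pipeline.
tmpdir=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmpdir/input.txt"

# Step 1: count the lines and write n.txt. The shell waits for the awk at the
# end of this pipeline to finish before it moves on to the next command.
wc -l < "$tmpdir/input.txt" | awk '{print $1}' > "$tmpdir/n.txt"

# Step 2: n.txt is complete by now, so getline in BEGIN reads the real count.
result=$(awk -v nfile="$tmpdir/n.txt" '
    BEGIN { getline n < nfile }
    { printf "%d/%d %s\n", NR, n, $0 }
' "$tmpdir/input.txt")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

The cost is that the input has to be read twice (or saved to a file first), which is the price of not relying on the scheduler.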
To select only the SSE flags, try:
awk '/SSE/' ORS=' ' RS=' '
The key thing here is setting the record separators on input and output to a space. That way, each option is accepted or rejected separately.
For example:
$ SUNCC_CXXFLAGS="-D__SSE2__ -D__SSE3__ -D__SSSE3__ -D__SSE4_1__ -D__SSE4_2__ -D__AES__ -D__PCLMUL__ ..."
$ newFLAGS="$(echo "$SUNCC_CXXFLAGS" | awk '/SSE/' ORS=' ' RS=' ')"
$ echo "$newFLAGS"
-D__SSE2__ -D__SSE3__ -D__SSSE3__ -D__SSE4_1__ -D__SSE4_2__
SSE appears to be a tight enough match here. If it isn't, we can be more specific:
$ newFLAGS="$(echo "$SUNCC_CXXFLAGS" | awk '/^-D__(SSE2|SSE3|SSSE3|SSE4_1|SSE4_2)__/' ORS=' ' RS=' ')"
$ echo "$newFLAGS"
-D__SSE2__ -D__SSE3__ -D__SSSE3__ -D__SSE4_1__ -D__SSE4_2__
Alternative: excluding SSE and AES
$ echo "$SUNCC_CXXFLAGS" | nawk '!/SSE|AES/' ORS=' ' RS=' '
-D__PCLMUL__ ...
Keeping options that match SSE or sse
$ SUNCC_CXXFLAGS="-D__SSE2__ -D__SSE3__ -D__SSSE3__ -D__SSE4_1__ -D__SSE4_2__ -D__AES__ -D__PCLMUL__ -xarch=sse3"
$ newFLAGS="$(echo "$SUNCC_CXXFLAGS" | awk '/SSE|sse/' ORS=' ' RS=' ')"
$ echo "$newFLAGS"
-D__SSE2__ -D__SSE3__ -D__SSSE3__ -D__SSE4_1__ -D__SSE4_2__ -xarch=sse3
The change here is that we replaced the regex /SSE/ with /SSE|sse/. Because the vertical bar, |, means logical-or, this matches either SSE or sse.
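To see why setting RS to a space lets each option be accepted or rejected on its own, here is a minimal sketch that prints every record awk sees. The flags value is a made-up sample, not the real SUNCC_CXXFLAGS:

```shell
# Made-up sample flags, for illustration only.
flags="-D__SSE2__ -D__AES__ -xarch=sse3"

# With RS=' ', every space-separated option becomes one record.
# printf '%s' is used instead of echo so the last record carries no trailing newline.
records=$(printf '%s' "$flags" | awk '{ printf "record %d: [%s]\n", NR, $0 }' RS=' ')
printf '%s\n' "$records"
```

Each option arrives as its own record, so a pattern such as /SSE|sse/ is tested against one option at a time rather than against the whole line.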
You cannot get the error number using getline. In your command, the output is from ls, not print result.
In the form cmd | getline result, cmd is run, then its output is piped to getline. It returns 1 if it got output, 0 on EOF, and -1 on failure. The problem is that the failure is from running getline itself, not the return code of cmd. For example, /etc/shadow can not be read by an unprivileged user, so a getline that reads from it fails to run and reports the error in the ERRNO variable (a GNU awk extension).
Note that GNU awk will return the cmd status from close(cmd) if not in POSIX mode. In POSIX mode, you won't get the exit status.
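The return values described above can be sketched as follows. This is a sketch under assumptions: the file path and the command string are hypothetical, and close() yields the command's exit status only in GNU awk 4.2 or later outside POSIX mode; other awk implementations may report something else.

```shell
out=$(awk 'BEGIN {
    # Reading from a file that cannot be opened: getline itself fails and returns -1.
    rc = (getline line < "/nonexistent/hypothetical-file")
    print "file getline returned: " rc

    # Hypothetical command: prints one line, then exits with status 3.
    cmd = "echo hello; exit 3"
    # cmd | getline returns 1 for each line read and 0 at EOF, whatever cmd exits with.
    while ((cmd | getline result) > 0)
        print "line: " result

    # Assumption: gawk 4.2+ outside POSIX mode returns the exit status of cmd here.
    status = close(cmd)
    print "close() returned: " status
}')
printf '%s\n' "$out"
```

The loop sees only the output of the command; the exit status, where the implementation exposes it at all, becomes available only once the pipe is closed.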