Bash – Using “ifne” in a pipeline – running multiple commands

bashpipeshell-script

In my script, I am using the ifne utility from the moreutils package. The line can be simplified to the following:

printf "asdf\n" | ifne cat - && echo "stream not empty"

ifne executes only if the stream is non-empty. But how can I make the second command (echo "stream not empty") also execute only if the stream is non-empty? For example, how can change the following command so that it doesn't print "stream not empty"?

printf "" | ifne cat - && echo "stream not empty"

Surrounding cat and echo with parentheses generates a syntax error:

printf "" | ifne (cat - && echo "stream not empty")

How can I execute the last command only if stream is non-empty?

Best Answer

ifne doesn't set an exit code based on whether the input is empty or not, so && and || aren't going to work as hoped. An alternate approach to Babyy's answer is to use pee from the same package:

printf "asdf\n" | pee 'ifne cat -' 'ifne echo "stream not empty"'

This works like tee, but duplicates the input stream into a number of pipes, treating each argument as a command to run. (tpipe is a similar command, but behaves slightly differently.)

A possible issue though is that each of the commands may be writing to stdout in parallel, depending on buffering and length of input/output there is a chance that output will be interleaved, or vary from run to run (effectively a race). This can probably be eliminated using sponge (same package) instead of cat, and/or other buffering/unbuffering solutions. It affects the example you gave, but may not affect your real use-case.

Related Solutions

Bash – eval limitation with piped commands

The problem here is actually an issue with the bash parser. There is no workaround other than editing and recompiling bash, and the 3333 limit is likely to be the same on all platforms.

The bash parser is generated with yacc (or, typically, with bison but in yacc mode). yacc parsers are bottom-up parsers, using the LALR(1) algorithm which builds a finite state machine with a pushdown stack. Loosely speaking, the stack contains all not-yet-reduced symbols, along with enough information to decide which productions to use to reduce the symbols.

Such parsers are optimized for left-recursive grammar rules. In the context of an expression grammar, a left-recursive rule applies to a left-associative operator, such as a−b in ordinary mathematics. That's left associative because the expression a−b−c groups ("associates") to the left, making it equal to (a−b)−c rather than a−(b−c). By contrast, exponentiation is right-associative, so that a^{b^c} is by convention evaluated as a^{(b^c)} rather than (a^b)^{^c}.

bash operators are process operators, rather than arithmetic operators; these include short-circuit booleans (&& and ||) and pipes (| and |&), as well as sequencing operators ; and &. Like mathematical operators, most of these associate to the left, but the pipe operators associate to the right, so that cmd1 | cmd2 | cmd3 is parsed as though it were cmd1 | { cmd2 | cmd3 ; } as opposed to { cmd1 | cmd2 ; } | cmd3. (Most of the time the difference is not important, but it is observable. [See Note 1])

To parse an expression which is a sequence of left associative operators, you only need a small parser stack. Every time you hit an operator, you can reduce (parenthesize, if you like) the expression to the left of it. By contrast, parsing an expression which is a sequence of right associative operators requires that you put all of the symbols onto the parser stack until you reach the end of the expression, because only then can you start reducing (inserting parentheses). (That explanation involves quite a bit of hand-waving, since it was intended to be non-technical, but it is based on the working of the real algorithm.)

Yacc parsers will resize their parser stack at runtime, but there is a compile-time maximum stack size, which by default is 10000 slots. If the stack reaches the maximum size, any attempt to expand it will trigger an out-of-memory error. Because | is right associative, an expression of the form:

statement | statement | ... | statement

will eventually trigger this error. If it were parsed in the obvious way, that would happen after 5,000 pipe symbols (with 5,000 statements). But because of the way the bash parser handles newlines, the actual grammar used is (roughly):

pipeline: command '|' optional_newlines pipeline

with the consequence that there is an optional_newlines grammar symbol after every |, so each pipe occupies three stack slots. Hence, the out-of-memory error is generated after 3,333 pipe symbols.

The yacc parser detects and signals the stack overflow, which it signals by calling yyerror("memory exhausted"). However, the bash implementation of yyerror tosses away the provided error message, and substitutes a message like "syntax error detected near unexpected token...". That's a bit confusing in this case.

Notes

The difference in associativity is most easily observed using the |& operator, which pipes both stderr and stdout. (Or, more accurately, duplicates stdout into stderr after establishing the pipe.) For a simple example, suppose that the file foo does not exist in the current directory. Then

# There is a race condition in this example. But it's not relevant.
$ ls foo | ls foo |& tr n-za-m a-z
ls: cannot access foo: No such file or directory
yf: pnaabg npprff sbb: Nb fhpu svyr be qverpgbel
# Associated to the left:
$ { ls foo | ls foo ; } |& tr n-za-m a-z
yf: pnaabg npprff sbb: Nb fhpu svyr be qverpgbel
yf: pnaabg npprff sbb: Nb fhpu svyr be qverpgbel
# Associated to the right:
$ ls foo | { ls foo |& tr n-za-m a-z ; }
ls: cannot access foo: No such file or directory
yf: pnaabg npprff sbb: Nb fhpu svyr be qverpgbel

Bash – Why does “bash -x” break this script

The problem is this line:

TIMESEC=$(echo blah | ( /usr/bin/time -f %e grep blah >/dev/null ) 2>&1 | awk -F. '{print $1}')

where you are redirecting the standard error to match the standard output. bash is writing its trace-messages to the standard error, and is (for instance) using its built-in echo along with other shell constructs all in the bash process.

If you change it to something like

TIMESEC=$(echo blah | sh -c "( /usr/bin/time -f %e grep blah >/dev/null )" 2>&1 | awk -F. '{print $1}')

it will work around that problem, and perhaps be an acceptable compromise between tracing and working:

++ awk -F. '{print $1}'
++ sh -c '( /usr/bin/time -f %e grep blah >/dev/null )'
++ echo blah
+ TIMESEC=0                 
+ echo ABC--0--DEF
ABC--0--DEF
+ '[' 0 -eq 0 ']'
+ echo 'we are here!'
we are here!

Best Answer

Related Solutions

Bash – eval limitation with piped commands

Notes

Bash – Why does “bash -x” break this script

Related Question