BEGIN and END with the awk command

awk

According to the awk manual, BEGIN and END are not used to match input, but rather to provide start-up and clean-up information to the awk script. Here is the example given:

ls -l | \
awk 'BEGIN { print "Files found:\n" } /\<[a|x].*\.conf$/ { print $9 }'
Files found:
amd.conf
antivir.conf
xcdroast.conf
xinetd.conf

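First, this prints a string to the output. Then it checks each input line for a pattern match: a word that starts with a, | or x (inside a bracket expression the | is a literal character, so [ax] was probably intended; \< is the GNU awk word-start anchor), followed by any character zero or more times, with .conf at the end of the line. For every matching line, the 9th field is printed.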
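One detail of the pattern is easy to check in isolation: inside a bracket expression, | is a literal character, so [a|x] matches a, | or x. The sketch below uses made-up file names (they are not from the manual) and ^ instead of \<, so that it does not depend on GNU awk.

```shell
# Hypothetical file names, one per line; the third starts with a literal "|".
names='amd.conf
xinetd.conf
|odd.conf
bind.conf'

# [a|x] accepts "a", "|" or "x" as the first character.
with_bar=$(printf '%s\n' "$names" | awk '/^[a|x].*\.conf$/ { print }')

# [ax] accepts only "a" or "x".
without_bar=$(printf '%s\n' "$names" | awk '/^[ax].*\.conf$/ { print }')

printf '%s\n' "[a|x] matched:" "$with_bar" "[ax] matched:" "$without_bar"
```

With these names, the first pattern should also pick up |odd.conf, while the second should not.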

We are forced to use BEGIN here. Does that mean an awk script can contain at most one print statement that is not inside a BEGIN or END block? If not, why can't we just put the print statement at the start without the BEGIN keyword? BEGIN seems superfluous.

Best Answer

BEGIN isn't superfluous. A rule with no pattern matches every input record, so if you don't specify BEGIN, the print is executed for every line of input.

Quoting from the manual:

A BEGIN rule is executed once only, before the first input record is read. Likewise, an END rule is executed once only, after all the input is read.

$ seq 5 | awk 'BEGIN{print "Hello"}/4/{print}'   # Hello printed once
Hello
4
$ seq 5 | awk '{print "Hello"}/4/{print}'        # Hello printed for each line of input
Hello
Hello
Hello
Hello
4
Hello
$
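The quoted passage also covers END, which the transcript above does not exercise. As a rough sketch (the odd-number condition and the count variable are only illustrative), the three kinds of rules can be combined like this:

```shell
summary=$(seq 5 | awk '
BEGIN { print "Start" }            # runs once, before any input is read
$1 % 2 == 1 { count++; print }     # runs for each record whose first field is odd
END { print "Matched:", count }    # runs once, after the last record
')
printf '%s\n' "$summary"
```

This should print Start once, then 1, 3 and 5, then Matched: 3. Each rule carries its own print, so a script is not limited to a single print statement; the pattern (or BEGIN/END) in front of each action only decides when that action runs.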