I'm not completely sure what your `if` is trying to do there. `NR` is the number of records; use `NF` for the number of fields, if that's what you're aiming for. You can't put `{}` blocks in the middle of things like that.
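To make the `NR`/`NF` difference concrete, here is a small sketch. The three input lines are made up for illustration and are not your data; the variable name `nr_nf` is just a placeholder.

```shell
# Hypothetical three-line input, not the data from the question.
# NR is the number of the record being processed, NF the number of fields on it.
nr_nf=$(printf 'a b c\nd e\nf\n' | awk '{ print NR, NF }')
printf '%s\n' "$nr_nf"
```

Each output line shows the record number followed by that record's field count, so the first line (three fields) prints `1 3`.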
I think what you're aiming for is to compare the value of a field in this line with a field in the previous line, printing out the sum when we reach a new "group" of data. If that's the case, this script will do what you want and I think equates pretty much to what you were aiming for:
    {
        if (last && $1 != last) {
            print last, sum
            sum = 0
        }
        sum = sum + $2
        last = $1
    }
    END {
        print last, sum
    }
We make a new variable `last` to hold the value of the first field (`$1`) on the previous line. We'll use that to track which group we're looking at.
- For every line (because we have `{ ... }` at the top level), we first test whether a) `last` is set (because we don't want to print anything on the very first line), and b) the value of the first field is different from `last`. If it is, we print out the value of `last`, a space (because of `,`), and the `sum` we've calculated. (If you want a tab, use `"\t"` in quotes like you had.)
- After printing, we reset `sum` to zero.
- Either way, we add the value of the second field (`$2`) to `sum`.
- For every line, we save the first field (our group) into `last`, so we can use it for comparison on the next line.
- Finally, we want to print out the last group as well. For that, we use an `END { ... }` block. It runs right at the end of the program when we run out of data. We print out the group and the sum we're working with just like we did before.
If I run:
    awk -f sum.awk < data
with your data file, I get this output:
    A 600
    B 900
    A 2100
as desired.
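If you want to try this without your own file, here is a self-contained sketch. The input values below are invented (I picked them so the totals match the output above); only the awk program itself is the one from this answer. The temporary directory and the variable `grouped` are placeholders.

```shell
# Hypothetical input: first column is the group key, second column a number.
# These values are made up; they are not the data from the question.
tmpdir=$(mktemp -d)
cat > "$tmpdir/data" <<'EOF'
A 100
A 200
A 300
B 400
B 500
A 600
A 700
A 800
EOF

# The same program as sum.awk, written to a file so it can be run with -f.
cat > "$tmpdir/sum.awk" <<'EOF'
{
    if (last && $1 != last) {
        print last, sum
        sum = 0
    }
    sum = sum + $2
    last = $1
}
END {
    print last, sum
}
EOF

grouped=$(awk -f "$tmpdir/sum.awk" < "$tmpdir/data")
printf '%s\n' "$grouped"
rm -rf "$tmpdir"
```

Note that the second run of `A` lines gets its own total, because the script only compares each line with the one directly before it.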
There are simpler ways to do this, both in awk and otherwise. In particular, we can replace the body above with:
    last && $1 != last {
        print last, sum
        sum = 0
    }
    {
        sum = sum + $2
        last = $1
    }
Here we use awk's conditional block (pattern-action) syntax rather than an explicit `if` test: the behaviour of this program is identical to the one above, but it's more idiomatic. It's not hugely different in this example, but it's useful to know about if you're learning awk.
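Put together with the unchanged `END` block, the whole program is short enough to pass on the command line. This is only a sketch: the three input lines are made up, `idiomatic` is a placeholder variable, and I've written `sum += $2`, which is shorthand for `sum = sum + $2`.

```shell
# Made-up input; the END block is carried over unchanged from the first script.
idiomatic=$(printf 'x 1\nx 2\ny 5\n' | awk '
    last && $1 != last { print last, sum; sum = 0 }   # group changed: flush total
    { sum += $2; last = $1 }                          # runs on every line
    END { print last, sum }                           # flush the final group
')
printf '%s\n' "$idiomatic"
```

With this input it prints `x 3` and then `y 5`.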
If the example file you gave is literally what your data looks like, with `#sum=` lines (or similar), you can use this script:
    {
        sum = sum + $2
        if (NF == 3) {
            print $1, sum
            sum = 0
        }
    }
For every line, this adds the value of the second field to the `sum` variable. On lines that have exactly three fields (`NF == 3`), we print out our total, and reset `sum` to zero.
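Here is how that might behave, as a sketch only. I'm guessing at the file layout: I assume the last line of each group carries a third field such as `#sum=`, and the values are invented. If your real file is laid out differently, the `NF == 3` test will need adjusting.

```shell
# ASSUMED layout: the final line of each group has a third field ("#sum=").
# Both the layout and the numbers are hypothetical.
marked=$(printf 'A 100\nA 200\nA 300 #sum=\nB 400\nB 500 #sum=\n' | awk '
{
    sum = sum + $2
    if (NF == 3) {      # a marker line closes the current group
        print $1, sum
        sum = 0
    }
}')
printf '%s\n' "$marked"
```

With this made-up input the output is `A 600` followed by `B 900`.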
Instead of a temporary file, you could make use of your shell's support for process substitution (this assumes `bash`, `zsh` or some implementations of `ksh`; the feature was introduced by ksh88):
    awk -f ./script.awk <(cat file.txt)
This will provide `awk` with a file name in place of the `<(...)` construct which, when read, will contain the output of the enclosed command.
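A minimal sketch of the same idea that runs on its own follows. The inline awk program and the `printf` command are stand-ins for `script.awk` and `cat file.txt`, and `psub` is a placeholder variable. It needs a shell with process substitution; a plain POSIX `sh` will reject the `<(...)` syntax.

```shell
#!/usr/bin/env bash
# Requires bash, zsh, or a ksh with process substitution.
# awk sees <(...) as an ordinary file name argument and reads the
# command's output from it.
psub=$(awk '{ total += $1 } END { print total }' <(printf '1\n2\n3\n'))
printf '%s\n' "$psub"
```

This prints `6`, the sum of the three numbers produced by the enclosed command.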
Best Answer
I'll join the other advice that you shouldn't parse the output of `ls`, so this is a bad example. But as a more general matter, I would include the awk script directly in the shell script by passing it as an argument to `awk`.
Note that if the awk script must include the `'` (single quote) character, you need to quote it: use `'\''` (close single quote, literal single quote, open single quote).
To avoid having to quote, you can use a here document instead. But it's awkward because you can't use standard input both to feed input to awk and to feed the script. You need to use an additional file descriptor (see "When would you use an additional file descriptor?" and "File descriptors & shell scripting").
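Both variants can be sketched as follows. The input names, the function name `quote_names`, and the variables are all hypothetical. The second variant assumes your system provides `/dev/fd` (Linux and most modern Unix systems do).

```shell
# 1) Script passed as a single-quoted argument. Each '\'' closes the quoted
#    string, adds a literal single quote, and reopens the string, so awk
#    receives the program: { print "'" $1 "'" }
quoted=$(printf 'alice\nbob\n' | awk '{ print "'\''" $1 "'\''" }')
printf '%s\n' "$quoted"

# 2) Script supplied through a here document on file descriptor 3, leaving
#    standard input free for the data. No extra quoting is needed inside
#    the here document because its delimiter is quoted ('EOF').
quote_names() {
    awk -f /dev/fd/3 3<<'EOF'
{ print "'" $1 "'" }
EOF
}
heredoc=$(printf 'alice\nbob\n' | quote_names)
printf '%s\n' "$heredoc"
```

Both forms print each name wrapped in single quotes; the first is usually easier to read unless the awk program itself contains many single quotes.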
Inside awk, you can read input from another command using the `getline` function and the pipe construct. It's not the way awk is primarily designed to be used, but it can be made to work. You need to quote the file name arguments for the underlying shell, which is highly error-prone. And since the text to be processed doesn't come from the expected sources (standard input or the files named on the command line), you end up with all the code in the `BEGIN` block.
In short, use a shell for what it's good at (such as piping commands together), and awk for what it's good at (such as text processing).
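For reference, the `getline`-from-a-command approach mentioned above might look roughly like this. It is a sketch, not a recommendation: the command string is a harmless stand-in, and `counted`, `cmd`, `line` and `n` are placeholder names. Any file names you put into `cmd` would have to be quoted for the shell that awk starts.

```shell
# All the work happens in BEGIN because no input comes from stdin or files.
counted=$(awk 'BEGIN {
    cmd = "echo one; echo two; echo three"   # stand-in command, run via sh
    while ((cmd | getline line) > 0) {       # read the command output line by line
        n++
        printf "%d: %s\n", n, line
    }
    close(cmd)                               # release the pipe when done
}')
printf '%s\n' "$counted"
```

This numbers the three lines produced by the command. Compare it with simply piping the command into awk from the shell, which needs none of the loop or the `close` call.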