I have a data file that I want to normalize using awk, based on the last data point. Therefore, I would like to access the last data point first, to normalize the data, and then process the file normally.
The following method, using tac twice, does the job, but is perhaps more complicated than necessary.
$ cat file
0 5
1 2
2 3
3 4
$ tac file | awk 'NR==1{norm=$2} {print $1, $2/norm}' | tac
0 1.25
1 0.5
2 0.75
3 1
My question is the following: Is it possible to obtain the above result by using awk only?
I think the answer is "No, awk scans the file line by line", but I am open to suggestions for alternatives.
Best Answer
You can do it as a two-pass solution in awk:
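A minimal sketch of such a two-pass command, assuming the sample data from the question. The function name normalize_two_pass and the temporary file are illustrative only and not part of the original answer:

```shell
# Recreate the sample data from the question in a temporary file.
data=$(mktemp)
printf '0 5\n1 2\n2 3\n3 4\n' > "$data"

# The same file is passed twice, so awk reads it in two passes.
# Pass 1 (FNR == NR): store $2 of every line; after the last line,
#                     norm holds the final data point.
# Pass 2:             divide each $2 by norm and print.
normalize_two_pass() {
    awk 'FNR == NR { norm = $2; next } { print $1, $2 / norm }' "$1" "$1"
}

normalize_two_pass "$data"
```

With the sample data this prints the same four lines as the tac pipeline in the question.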
If your version of awk supports the ENDFILE block (e.g. GNU awk 4+), you can do it like this:
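A hedged sketch of an ENDFILE variant, again using the sample data from the question. It assumes gawk is installed under the name gawk and that the fields of the last record are still available inside the ENDFILE block; the function name normalize_endfile is illustrative:

```shell
# Recreate the sample data from the question in a temporary file.
data=$(mktemp)
printf '0 5\n1 2\n2 3\n3 4\n' > "$data"

# ENDFILE runs once after each input file. After the first pass it
# stores the last line's $2 in norm (assumption: $2 is still set there).
# NR != FNR is true only while the second copy of the file is read.
normalize_endfile() {
    gawk 'ENDFILE { norm = $2 } NR != FNR { print $1, $2 / norm }' "$1" "$1"
}

# ENDFILE is a GNU awk extension, so only run this when gawk exists.
if command -v gawk >/dev/null 2>&1; then
    normalize_endfile "$data"
else
    echo "gawk not found; ENDFILE is a GNU awk extension" >&2
fi
```

The file is still listed twice: ENDFILE only replaces the FNR == NR bookkeeping, not the second pass.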
Note that it is more efficient to seek to the end of the file first; see camh's answer.
Explanation
The first example works by remembering the previous $2 during the first pass, i.e. the first block is only evaluated while the local line counter (FNR) is equal to the global line counter (NR), which holds only for the first file argument. The next command skips to the next line; in this case it ensures that the last block is only evaluated while the second file argument is parsed.
The second example has similar logic, but takes advantage of the ENDFILE block, which is evaluated when the end of an input file is reached.
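To make the FNR == NR test concrete, the following sketch prints both counters while awk reads the same file twice. The function name show_counters and the temporary file are illustrative assumptions:

```shell
# Recreate the sample data from the question in a temporary file.
data=$(mktemp)
printf '0 5\n1 2\n2 3\n3 4\n' > "$data"

# NR keeps counting across all input files; FNR restarts at 1 for
# each file. They are equal only while the first file is being read.
show_counters() {
    awk '{ print NR, FNR, (FNR == NR ? "first pass" : "second pass") }' "$1" "$1"
}

show_counters "$data"
```

For the four-line sample file, lines 1 to 4 are reported as "first pass" (NR and FNR both run from 1 to 4) and lines 5 to 8 as "second pass" (NR runs from 5 to 8 while FNR restarts at 1).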