Sort and Uniq in Awk – How to Use


I know that "sort" and "uniq" exist, but today's question is about how to use AWK for that kind of job. Say I have a list of anything (IPs, names, or numbers) and I want to sort it.

Here is an example in which I extract the IP addresses from a mail log:

awk 'match($0,/\[[[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+\]/) { if ( NF == 8 && $6 == "connect" ) {print substr($0, RSTART+1,RLENGTH-2)} }' maillog

Is it possible to sort the IPs "on the go" within the same awk command? I do not need a complete answer, just some hints on where to start.

Cheers!

Best Answer

To sort, you can use a pipe inside the awk command itself, as in:

awk '{ print ... | "sort ..." }'

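With this syntax, awk opens the pipe on the first print and keeps it open, so every line printed this way is passed to the same instance of sort. The sorted output appears once awk closes the pipe or exits.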
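As a minimal sketch of that pipe, the following feeds a few made-up IP addresses (the sample data and the variable names are assumptions, not taken from the question's mail log) through a single sort process started from inside awk. The sort keys are chosen here to order the addresses numerically octet by octet; a plain "sort" would order them as text instead.

```shell
# Hypothetical sample input: one IP address per line.
ips='192.168.1.1
10.0.0.2
9.0.0.1
10.0.0.1'

sorted=$(printf '%s\n' "$ips" | awk '
    # Keep the command in a variable so print and close() use the same string.
    BEGIN { cmd = "sort -t . -k1,1n -k2,2n -k3,3n -k4,4n" }
    # Every print goes to the same sort process.
    { print $0 | cmd }
    # Closing the pipe lets sort see end of input and emit its result.
    END { close(cmd) }
')

printf '%s\n' "$sorted"
```

The explicit close() is optional when nothing else is printed afterwards, since awk closes open pipes when it exits, but it matters if the END block prints more output that should appear after the sorted lines.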

Of course, you can do the same thing at the shell level:

awk '{ print ... }' | sort ...

Or you can use GNU awk, which has a couple of sort functions built in (asort and asorti).
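A small sketch of the GNU awk route, assuming gawk is installed under the name "gawk" (the sample words and variable names are made up for illustration). It collects the lines in an array and sorts the values with asort in the END block, so nothing is printed until all input has been read.

```shell
# asort() is a gawk extension, so check that gawk is available first.
if command -v gawk >/dev/null 2>&1; then
    sorted=$(printf '%s\n' pear apple fig | gawk '
        # Store each input line under its line number.
        { lines[NR] = $0 }
        END {
            # asort() sorts the values in place, reindexes them from 1,
            # and returns the number of elements.
            n = asort(lines)
            for (i = 1; i <= n; i++) print lines[i]
        }
    ')
    printf '%s\n' "$sorted"
else
    echo "gawk not found; this example needs GNU awk" >&2
fi
```

asorti works the same way but sorts the array indices rather than the values, which fits when the data is stored as array keys.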

In awk, uniq is typically accomplished by saving each unique data element or key in an associative array and checking whether new data has already been seen. One example to illustrate:

awk '!a[$0]++'

This means: if the current line is not yet in the array, a[$0] is 0, so the negated condition is true and the default action (printing the line) is triggered. The post-increment then records the line, so subsequent lines with the same data make the condition false and are not printed.
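To make the idiom concrete, here is a short sketch on made-up input (the addresses and the variable names are assumptions for illustration). It runs the terse form next to a spelled-out equivalent; both keep the first occurrence of each line and preserve the original order.

```shell
# Hypothetical sample input with repeated addresses, in arrival order.
input='10.0.0.2
10.0.0.1
10.0.0.2
10.0.0.3
10.0.0.1'

# Terse idiom: print a line only the first time it is seen.
deduped=$(printf '%s\n' "$input" | awk '!seen[$0]++')

# Equivalent long form, spelling out the test and the counter update.
deduped_long=$(printf '%s\n' "$input" |
    awk '{ if (seen[$0] == 0) print; seen[$0]++ }')

printf '%s\n' "$deduped"
```

Unlike uniq, this does not need the duplicates to be adjacent, so the input does not have to be sorted first. The cost is that every distinct line is held in memory.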
