Shell – Sort output of awk except for first line

Tags: awk, pipe, scripting, shell-script, sort

I keep running into this use case when parsing CSV files. When it is an inline awk script embedded in a shell script, I can use this workaround:

myfile="$(mktemp)"
awk '(awk script here)' > "$myfile"
head -n 1 "$myfile"
sed 1d "$myfile" | sort
rm "$myfile"

(Or using appropriate mktemp template for BSD mktemp; GNU works as above.)

However, when writing a full-fledged awk script with the shebang #!/bin/awk -f, I don't want to have to change it to a shell script just to sort the output.

How can I do this in awk? Or, if there is no native sort function in awk, where can I learn about awk pipelines and how can I use pipelines to accomplish this without changing the shebang?

Best Answer

Here is an example that sorts all lines but the first:

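#!/bin/awk -f
BEGIN { cmd = "sort" }       # command that receives the data lines
NR == 1 { print; next }      # print the header line directly
{ print $1, $2 | cmd }       # pipe the remaining lines to sort
END { close(cmd) }           # close the pipe so sort emits its output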

Example

Consider this file:

$ cat file
Letter  Value
A       12
D       10
C       15
B       13

Then:

$ awk -f script.awk file
Letter  Value
A 12
B 13
C 15
D 10

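The first input line is the first output line. The remaining lines are sorted. Because the script prints $1,$2, the two fields are re-joined with a single space and any further columns would be dropped; a plain print | cmd passes each line through unchanged.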
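For CSV input with more than two columns, a variant along the following lines may be closer to what the question needs. This is only a sketch: the sample data is made up for illustration, the awk program is wrapped in a shell snippet purely so it can be run as-is, and it assumes the awk in use provides fflush() (gawk, mawk and BusyBox awk do). The flush is there because, when awk's output goes to a pipe or file rather than a terminal, the header could otherwise sit in awk's buffer until after sort has already written its lines.

```shell
# Hypothetical sample data; the contents are illustrative only.
tmp="$(mktemp)"
printf '%s\n' 'Letter,Value' 'D,10' 'A,12' 'C,15' 'B,13' > "$tmp"

# Print the header, flush it, then pipe every remaining record
# unchanged into sort.
result="$(awk '
    BEGIN { cmd = "sort" }
    NR == 1 { print; fflush(); next }   # header goes out before sort runs
    { print | cmd }                     # whole line, not just $1 and $2
    END { close(cmd) }                  # wait for sort to finish
' "$tmp")"

printf '%s\n' "$result"
rm -f "$tmp"
```

The program between the single quotes can be saved on its own under a #!/bin/awk -f shebang, so the script stays a pure awk script.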

Related Solutions

Shell Script – Sorting Files into Directories and Subdirectories

If all you are looking to do is produce a nice formatted output of all .md files, tree should do exactly what you want:

tree -P '*.md' /home/user/doc

Add -A for pretty lines. tree can also do interesting things, such as producing HTML or XML output.

Shell – How to use multiple if statement inside another if statement of a awk program

Nested if statements are effectively "condition AND condition", so if you do not need to do any particular processing as you step through the nesting, you can simply join all the conditions with &&.

  awk '{ if( ( $4 == "TRX" || $4 == "TX" ) &&
             ( $10 == "BTS INT UNAFF" || $10 == "LOCAL MODE" ) && 
             ( $12 != "OPER" && $12 != "" ) && # Second Value should not be blank
             ( $22 != "2000") &&
             ( !match( $1, "_") ) ) # Should not contain _ in the value.
           { print } 
       }' FS=, file

To define your field separator, you can use the option -F, (that is, -F followed by the comma) instead of the FS=, assignment operand if you prefer. Alternatively, keep it fully contained within the awk code by setting it in the BEGIN{ } block, which runs before any input is read: BEGIN{ FS="," }
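As a quick check that the three ways of setting the separator behave the same, here is a small sketch. The input record is made up for illustration, and the data is fed on standard input rather than from a file.

```shell
# Hypothetical one-line CSV record used only for illustration.
line='alpha,beta,gamma'

# 1. The -F option on the command line.
a="$(printf '%s\n' "$line" | awk -F, '{ print $2 }')"

# 2. FS=, as an assignment operand; with no file operand it is
#    applied before standard input is read.
b="$(printf '%s\n' "$line" | awk '{ print $2 }' FS=,)"

# 3. FS set inside the program, in the BEGIN block.
c="$(printf '%s\n' "$line" | awk 'BEGIN { FS = "," } { print $2 }')"

printf '%s %s %s\n' "$a" "$b" "$c"
```

Each form extracts the second comma-separated field, so the snippet prints the same value three times.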
