Awk + paste for cleaning up PATH

awk, path, tcsh, text processing

I have seen this code in .cshrc init files on a few machines. I went through a few awk tutorials to try to understand how it works, but I am still unable to decipher it.

setenv PATH `echo $PATH | awk 'NF&&\\!x[$0]++' RS='[:|\n]' | paste -sd:`

What does it do?

Best Answer

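It doesn't work for me with the backslashes (they are presumably there to protect the "!" from tcsh history expansion, which applies even inside single quotes), but I can explain the version without them: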

echo "$PATH" | awk 'NF && !x[$0]++' RS='[:|\n]'

The record separator (RS) is set to a bracket expression that matches any one of the characters ":", "|" and newline. $PATH is usually a single line with elements separated by ":". With this RS, awk treats every path element as a separate record, as if each one were on its own line instead of being separated by ":".
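To see the record splitting on its own, here is a minimal sketch in plain sh. It assumes a single-character RS of ":" rather than the bracket expression from the question, because POSIX leaves the behavior of a multi-character RS unspecified (gawk and mawk treat it as a regular expression). The helper name split_path is made up for this illustration.

```shell
#!/bin/sh
# split_path is a hypothetical helper, not part of the original command.
# It prints each colon-separated element with its record number (NR).
split_path() {
    # printf '%s' avoids the trailing newline that echo would append,
    # so the last record does not end in a stray "\n".
    printf '%s' "$1" | awk -v RS=':' '{ print NR ": " $0 }'
}

split_path "/usr/bin:/bin:/usr/local/bin"
# 1: /usr/bin
# 2: /bin
# 3: /usr/local/bin
```

Each element arrives in $0 as its own record, which is what lets the pattern in the original one-liner test elements one at a time.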

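The NF condition skips empty records (NF == 0), such as the one produced by "::". x is an associative array that uses the path as its subscript. !x[$0]++ is true only while x[$0] is still 0, that is, the first time a path is seen. The post-increment then raises the count, so for every later occurrence of the same path !x[$0] is false and the record is skipped. Since the pattern has no action, awk applies the default action and prints the record. The result is that every path is output exactly once, in its original order. In the original command, paste -sd: then joins those lines back into a single string separated by ":".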
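Put together, the whole pipeline can be sketched as a small sh function. This is a Bourne-shell illustration, not the tcsh line from the question, so the "!" needs no backslashes. The function name dedupe_path and the array name seen are made up here, and a single-character RS is assumed for portability.

```shell
#!/bin/sh
# dedupe_path is a hypothetical helper name used only for this sketch.
# It removes empty and repeated elements from a colon-separated list,
# keeping the first occurrence of each element in its original position.
dedupe_path() {
    printf '%s' "$1" |
        awk -v RS=':' 'NF && !seen[$0]++' |   # print first occurrence only
        paste -sd: -                          # join the lines with ":"
}

dedupe_path "/usr/bin:/bin:/usr/bin::/bin:/usr/local/bin"
# /usr/bin:/bin:/usr/local/bin
```

The explicit "-" operand tells paste to read standard input, which some implementations require. In a Bourne-style shell the result could be assigned with PATH=$(dedupe_path "$PATH").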

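The following example also prints how often each element occurred, once the last record has been processed (the order in which "for (var in x)" visits the elements is unspecified, so the count lines may come out in a different order):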

echo "a:b:a:c:a:b" |
  awk 'NF && !x[$0]++;END {for (var in x) print var ": " x[var]}' RS='[:|\n]'
a
b
c
a: 3
b: 2
c: 1
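If a stable order for the counts matters, one option is to pipe them through sort, since awk does not define the traversal order of a for-in loop. The sketch below assumes that approach. The names count_entries and n are made up for the illustration, and a single-character RS is used for portability.

```shell
#!/bin/sh
# count_entries is a hypothetical helper name used only for this sketch.
# It prints "element: count" for every non-empty element, sorted by element.
count_entries() {
    printf '%s' "$1" |
        awk -v RS=':' '
            NF  { n[$0]++ }                          # count each non-empty record
            END { for (k in n) print k ": " n[k] }   # order unspecified here
        ' |
        sort                                         # make the order deterministic
}

count_entries "a:b:a:c:a:b"
# a: 3
# b: 2
# c: 1
```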