Ubuntu – How to print the last 5 fields in awk

Tags: awk, bash, command line, text processing

I have 10 fields and I want to print fields 6 through 10, ignoring the first 5 fields. How can I use NF in awk to do that?

f1 f2 f3 f4 f5 f6 f7 f8 f9 f10
c1 c2 c3 c4 c5 c6 c7 c8 c9 c10

I want to show only:

f6 f7 f8 f9 f10
c6 c7 c8 c9 c10

Best Answer

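You need to loop through the fields, passing each one to printf as an argument rather than as the format string, so that a literal % in the data is printed as-is:

bash-4.3$ awk '{for(i=6;i<=NF;i++) printf "%s ", $i; print ""}' input_file.txt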
f6 f7 f8 f9 f10 
c6 c7 c8 c9 c10
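Since the question asks about NF specifically, here is a minimal sketch that counts back from NF instead of hard-coding 6 as the starting field. The function name last_fields and the variable n are illustrative, not part of the original answer. It also prints the separator only between fields, so there is no trailing space.

```shell
# Print the last n fields of every line, however many fields the line has.
# "last_fields" and "n" are hypothetical names used for illustration.
last_fields() {
    awk -v n="${1:-5}" '{
        if (NF == 0) { print ""; next }   # keep blank lines as blank lines
        start = NF - n + 1
        if (start < 1) start = 1          # fewer than n fields: print them all
        for (i = start; i <= NF; i++)
            printf "%s%s", $i, (i < NF ? OFS : ORS)   # OFS between fields, ORS after the last
    }'
}

printf 'f1 f2 f3 f4 f5 f6 f7 f8 f9 f10\n' | last_fields 5
# prints: f6 f7 f8 f9 f10
```

With the sample file this would be called as last_fields 5 < input_file.txt.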

Or you can set the first five fields to the empty string (the separators between them remain, which is why the output starts with spaces):

bash-4.3$ awk '{for(i=1;i<=5;i++) $i="";print}' input_file.txt 
     f6 f7 f8 f9 f10
     c6 c7 c8 c9 c10
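If the leading spaces are unwanted, one possible variation is to strip them after blanking the fields. This is only a sketch: drop_first_fields and n are made-up names, and it assumes the default output field separator (a single space). Because assigning to a field makes awk rebuild the line, runs of whitespace between the remaining fields are collapsed to one space.

```shell
# Blank the first n fields, then remove the separators they leave behind.
# "drop_first_fields" and "n" are hypothetical names; assumes OFS is a single space.
drop_first_fields() {
    awk -v n="${1:-5}" '{
        for (i = 1; i <= n && i <= NF; i++) $i = ""   # assigning to a field rebuilds $0 using OFS
        sub(/^ +/, "")                                 # drop the leading spaces left by the emptied fields
        print
    }'
}

printf 'f1 f2 f3 f4 f5 f6 f7 f8 f9 f10\n' | drop_first_fields 5
# prints: f6 f7 f8 f9 f10
```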

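Or use a substring of the whole line to print all characters from where field 6 begins (credit to https://stackoverflow.com/a/12900372/3701431). Note that index() returns the first occurrence of the field's text, so this only works when that text does not appear earlier in the line: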

bash-4.3$ awk '{print substr($0,index($0,$6))}' input_file.txt 
f6 f7 f8 f9 f10
c6 c7 c8 c9 c10
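The substr/index approach can misfire when the text of field 6 also occurs earlier in the line, because index() finds the first match. The sketch below shows that case and one possible workaround: removing leading fields one at a time with sub(), which leaves the spacing of the rest of the line untouched. The name skip_fields is illustrative, and the regex assumes fields are separated by spaces or tabs.

```shell
# Remove the first n whitespace-separated fields without rebuilding the line.
# "skip_fields" and "n" are hypothetical names used for illustration.
skip_fields() {
    awk -v n="${1:-5}" '{
        for (i = 1; i <= n; i++)
            sub(/^[ \t]*[^ \t]+[ \t]*/, "")   # strip one leading field and the blanks around it
        print
    }'
}

# Field 6 is "x", but "x" is also field 1, so index() returns 1:
printf 'x y z a b x end\n' | awk '{print substr($0,index($0,$6))}'
# prints: x y z a b x end

printf 'x y z a b x end\n' | skip_fields 5
# prints: x end
```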

Or simply use the cut command:

bash-4.3$ cut -d " " -f6-10 input_file.txt
f6 f7 f8 f9 f10
c6 c7 c8 c9 c10
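Two things are worth knowing about cut here. First, an open-ended range such as -f6- means "field 6 to the end", so the total field count need not be known. Second, cut treats every single delimiter as a field boundary, so two consecutive spaces produce an empty field. A rough sketch that squeezes repeated spaces first (cut_from_field is a made-up name; a line that begins with a space would still get an empty first field):

```shell
# Squeeze runs of spaces, then print from field k to the end of the line.
# "cut_from_field" is a hypothetical helper name.
cut_from_field() {
    tr -s ' ' | cut -d ' ' -f"${1:-6}"-
}

# Without squeezing, the double space after f1 shifts the field numbering:
printf 'f1  f2 f3 f4 f5 f6 f7 f8 f9 f10\n' | cut -d ' ' -f6-
# prints: f5 f6 f7 f8 f9 f10

printf 'f1  f2 f3 f4 f5 f6 f7 f8 f9 f10\n' | cut_from_field 6
# prints: f6 f7 f8 f9 f10
```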

Python can do that too:

bash-4.3$ python3 -c 'import sys;fields=[" ".join(line.strip().split()[5:]) for line in sys.stdin];print("\n".join(fields))' < input_file.txt
f6 f7 f8 f9 f10
c6 c7 c8 c9 c10

or alternatively:

$ python3 -c "import sys;print('\n'.join(map(lambda x:' '.join(x.split()[5:]),sys.stdin.readlines())))" < input_file.txt
f6 f7 f8 f9 f10
c6 c7 c8 c9 c10

Or with Ruby:

bash-4.3$ ruby -ne 'print $_.split()[5..10].join(" ");print "\n"' < input_file.txt 
f6 f7 f8 f9 f10
c6 c7 c8 c9 c10

Bash + xargs can do it too, although a bit more convoluted:

bash-4.3$ cat input_file.txt | xargs -L 1 bash -c 'arr=($@);for i in $(seq 5 10);do printf "%s " ${arr[$i]} ; done; echo' sh
f6 f7 f8 f9 f10  
c6 c7 c8 c9 c10

Related Solutions

Ubuntu – Awk command to print all the lines except the last three lines

It's clunky, but you can store every line in an array and, at the end, when you know the total length, output everything but the last 3 lines.

... | awk '{l[NR] = $0} END {for (i=1; i<=NR-3; i++) print l[i]}'

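Another approach (more efficient here) is to rotate the lines manually through three variables, printing once more than three lines have been read (testing NR rather than the truthiness of a, so blank lines and a bare 0 are not dropped):

... | awk 'NR>3 {print a} {a=b; b=c; c=$0}'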

a is only printed after a line has moved from c to b and then into a, so the output always lags three lines behind the input. The immediate upsides are that it doesn't store all the content in memory and it shouldn't cause buffering issues (call fflush() after printing if it does). The downside is that it's not simple to scale up: if you want to skip the last 100 lines, you need 100 variables and 100 variable juggles.

If awk had push and pop operators for arrays, it would be easier.
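Even without push and pop, an array indexed by NR modulo n can act as a fixed-size ring, which scales to any n while holding only n lines in memory. This is a sketch of that general idea, not necessarily the circular-buffer answer that appears in the benchmark; drop_last_lines and buf are illustrative names, and n is assumed to be at least 1.

```shell
# Print everything except the last n lines, keeping only n lines in memory.
# "drop_last_lines" and "buf" are hypothetical names; n must be >= 1.
drop_last_lines() {
    awk -v n="${1:-3}" '
        NR > n { print buf[NR % n] }   # this slot still holds the line read n records ago
               { buf[NR % n] = $0 }    # overwrite it with the current line
    '
}

seq 6 | drop_last_lines 3
# prints 1, 2 and 3 on separate lines
```

Like the three-variable version, this works on streamed input because it never needs to know the total number of lines in advance.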

Or we could pre-calculate the number of lines and how far we actually want to go with $(($(wc -l < file) - 3)). This is relatively useless for streamed content but on a file, works pretty well:

awk -v n=$(($(wc -l < file) - 3)) 'NR<=n' file
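The same pre-counting idea can be kept inside awk by naming the file twice: the first pass counts the lines and the second pass prints. The following is a hedged sketch of that two-pass pattern, not necessarily the exact "double-file" command measured in the benchmark. The helper name drop_last_lines_file and the variable total are made up, and the approach needs a regular file that can be read twice.

```shell
# Two-pass awk: count lines on the first read, print all but the last n on the second.
# "drop_last_lines_file" and "total" are hypothetical names. $1 = file, $2 = n (default 3).
drop_last_lines_file() {
    awk -v n="${2:-3}" '
        NR == FNR { total++; next }    # first pass: only count the lines
        FNR <= total - n               # second pass: print until n lines remain
    ' "$1" "$1"
}

tmp=$(mktemp)
seq 6 > "$tmp"
drop_last_lines_file "$tmp" 3
# prints 1, 2 and 3 on separate lines
rm -f "$tmp"
```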

Typically you'd just use head though (a negative line count is supported by GNU head):

$ seq 6 | head -n-3
1
2
3

Using terdon's benchmark we can actually see how these compare. I thought I'd offer a full comparison though:

head: 0.018s (me)
awk + wc: 0.169s (me)
awk 3 variables: 0.178s (me)
awk double-file: 0.322s (terdon)
awk circular buffer: 0.355s (Scrutinizer)
awk for-loop: 0.693s (me)

The fastest solution is to let a C-optimised utility like head or wc handle the heavy lifting, but in pure awk the manually rotating stack is king for now.

Ubuntu – How to use double substitution in awk

All you need is the power of awk and a for statement:

paste <(awk -F, '{ for (i=29;i<=188;i++) print $i }' PreRefFile.csv) <(awk -F, '{ for (i=29;i<=188;i++) print $i }' Txlog.csv)

My test case:

paste <(awk -F, '{ for (i=2;i<=3;i++) print $i }' foo1) <(awk -F, '{ for (i=2;i<=3;i++) print $i }' foo2)

File foo1:

1,2,3,4,5,6
7,8,9,10,11,12

File foo2:

a,b,c,d,e,f,g
A,B,C,D,E,F,G

Output (columns separated by a tab, as paste produces them):

2	b
3	c
8	B
9	C
