I have 2 data files each containing one column.
I want to make another data file by merging both the columns.
I have a shell command that does this, but I don't know how it works.
Please explain the command below in detail:
awk 'NR==FNR {a[i++]=$0};
{b[x++]=$0;};{k=x-i};
END {for(j=0;j<i;) print a[j++],b[k++]}' \
file1.txt file2.txt
Example:
input:
file1.txt
11
23
19
31
67
file2.txt
13
19
25
67
93
I used the command above to write a shell script and got the following output:
11 13
23 19
19 25
31 67
67 93
How does this command process the example input to produce that output?
Best Answer
Well, part of learning to use Unix is to figure out what existing scripts are doing. In this case you need to know a bit about how awk works to understand the code. I will focus on describing the awk part; this should get you started in figuring out the rest.
Basically, awk is a pattern-driven scripting language, where commands consist of both a (search) pattern/condition and a corresponding code block. During execution, any input files are read line by line, and if the pattern/condition is true for a line, the code block is executed. There are special patterns, BEGIN and END, which are used to trigger code to get executed before the first line or after the last line is read.
In your example you have three pattern/code lines:
NR==FNR {a[i++]=$0};
NR and FNR are two special variables set by awk. You can look up their meaning with man awk to see that NR is the number of input records (lines) read so far across all files, while FNR is the number of records read so far from the current file. So basically this condition is true while lines from the first file are read (which means that a[i++]=$0 is executed once for each line from the first file) and false for all additional files. $0 is the current line of input.
{b[x++]=$0;};{k=x-i};
This code block has no condition/pattern, so it gets executed for every line read (from all files, including the first one).
END {for(j=0;j<i;) print a[j++],b[k++]}
This part runs after the last line of the last file has been read and processed.
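To watch these conditions fire, here is a minimal sketch that prints NR and FNR for each line. The scratch files f1.txt and f2.txt and the variable trace are made up for this illustration and are not part of the original command.

```shell
# Hypothetical scratch files, two lines each (not the files from the question).
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/f1.txt"
printf 'c\nd\n' > "$tmp/f2.txt"

trace=$(awk '
  BEGIN     { print "start" }                    # runs before any input is read
  NR == FNR { print "first file only: " $0 }     # true only while reading the first file
            { print "NR=" NR " FNR=" FNR }       # no pattern: runs for every line
  END       { print "done after " NR " lines" }  # runs after the last line of the last file
' "$tmp/f1.txt" "$tmp/f2.txt")

printf '%s\n' "$trace"
rm -rf "$tmp"
```

For the two lines of the second file this prints NR=3 FNR=1 and NR=4 FNR=2: NR keeps counting while FNR restarts at 1, which is why NR==FNR only holds for the first file.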
With these basics you should be able to figure out the meaning of the different code blocks and variables yourself.
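If you want to check your reading afterwards, the following is one possible annotated expansion of the command, run on the sample data from the question. The temporary directory and the variable merged are assumptions added for this sketch; the awk logic is meant to be the same as in the original one-liner.

```shell
# Recreate the sample input from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 11 23 19 31 67 > "$tmp/file1.txt"
printf '%s\n' 13 19 25 67 93 > "$tmp/file2.txt"

merged=$(awk '
  # First file only: store its lines in a[0], a[1], ...; i ends up as its line count.
  NR == FNR { a[i++] = $0 }
  # Every line of both files: b holds the lines of file1 followed by the lines of file2.
  { b[x++] = $0 }
  # x - i is the number of lines read from the second file so far.
  # With files of equal length, the final value equals i,
  # which is the index in b where the lines of file2 start.
  { k = x - i }
  # After all input: pair line j of file1 with the matching line of file2.
  END { for (j = 0; j < i; ) print a[j++], b[k++] }
' "$tmp/file1.txt" "$tmp/file2.txt")

printf '%s\n' "$merged"
rm -rf "$tmp"
```

With the sample files, i ends at 5 and x at 10, so k starts at 5 in the END block and the loop prints a[0] with b[5], a[1] with b[6], and so on, giving the five pairs from 11 13 through 67 93 shown in the question. Note that this pairing appears to rely on both files having the same number of lines.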