How does this AWK script work

command-line unix

I have two data files, each containing one column.
I want to make another data file by merging the two columns.
I have a shell command line that does this, but I don't know how it works.

Please explain in detail how the command below works:

awk 'NR==FNR {a[i++]=$0};
             {b[x++]=$0;};{k=x-i};
     END     {for(j=0;j<i;) print a[j++],b[k++]}' \
  file1.txt file2.txt

Example:

input:

file1.txt   
11
23
19
31
67
file2.txt
13
19
25
67
93

I used the command above in a shell script and got the following output:

How does this command line process the example above to produce that output?

Best Answer

Well, part of learning to use Unix is figuring out what existing scripts do. In this case you need to know a bit about how awk works to understand the code. I will focus on describing the awk part; this should get you started on figuring out the rest.

Basically awk is a pattern-driven scripting language, where commands consist of both a (search) pattern/condition and a corresponding code block. During execution, any input files are read line by line and if the pattern/condition is true for a line, the code block is executed. There are special patterns BEGIN and END which are used to trigger code to get executed before the first line or after the last line is read.
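
As a minimal sketch of this pattern/code-block model (the file name numbers.txt, its contents, and the variable names are made up for illustration and are not part of the question), the following runs one block before any input, one only for matching lines, one for every line, and one after the last line:

```shell
# Hypothetical sample input, created only for this demonstration.
tmpdir=$(mktemp -d)
printf '%s\n' 5 12 7 20 > "$tmpdir/numbers.txt"

summary=$(awk '
    BEGIN   { print "start" }                     # runs once, before any input is read
    $0 > 10 { big++; print "big:", $0 }           # runs only for lines where the pattern is true
            { total += $0 }                       # no pattern: runs for every line
    END     { print "big=" big, "total=" total }  # runs once, after the last line
' "$tmpdir/numbers.txt")

printf '%s\n' "$summary"
rm -r "$tmpdir"
```

With this input it should print "start", then "big: 12" and "big: 20", then "big=2 total=44".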

In your example you have three pattern/code lines:

NR==FNR {a[i++]=$0};

NR and FNR are two special variables set by awk. You can look up their meaning with man awk to see that

NR     ordinal number of the current record
FNR    ordinal number of the current record in the current file

so basically this condition is true while lines from the first file are read (which means that a[i++]=$0 is executed once for each line of the first file) and false for all additional files, because NR keeps counting across files while FNR restarts at 1 for each new file. $0 is the current line of input.
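
To see the two counters side by side, here is a small sketch that prints them for every line of two short files (shortened, two-line versions of the example files, created in a temporary directory; the labels "first-file" and "later-file" are just illustrative):

```shell
# Two-line versions of the example files, in a throwaway directory.
tmpdir=$(mktemp -d)
printf '%s\n' 11 23 > "$tmpdir/file1.txt"
printf '%s\n' 13 19 > "$tmpdir/file2.txt"

# NR counts records across all files; FNR restarts at 1 for each file.
counters=$(cd "$tmpdir" && awk '
    { print FILENAME, "NR=" NR, "FNR=" FNR, (NR == FNR ? "first-file" : "later-file") }
' file1.txt file2.txt)

printf '%s\n' "$counters"
rm -r "$tmpdir"
```

The lines from file1.txt should show NR equal to FNR (1 and 2), while the lines from file2.txt should show NR=3 FNR=1 and NR=4 FNR=2.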

        {b[x++]=$0;};{k=x-i};

These two code blocks have no condition/pattern, so they get executed for every line read (from all files, including the first one).
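
One way to watch what these blocks do is to add a print statement to the last one and run the first three blocks on shortened, two-line versions of the example files. This is only a tracing sketch; the print statement is not part of the original command:

```shell
# Two-line versions of the example files, in a throwaway directory.
tmpdir=$(mktemp -d)
printf '%s\n' 11 23 > "$tmpdir/file1.txt"
printf '%s\n' 13 19 > "$tmpdir/file2.txt"

trace=$(cd "$tmpdir" && awk '
    NR==FNR { a[i++] = $0 }   # i counts the lines of the first file
            { b[x++] = $0 }   # x counts the lines of all files
            { k = x - i; print $0 ": i=" i, "x=" x, "k=" k }   # added trace output
' file1.txt file2.txt)

printf '%s\n' "$trace"
rm -r "$tmpdir"
```

The trace should show k staying at 0 while the first file is read and then growing by one per line of the second file, so at the end k equals the number of lines in the second file. Since b also holds the lines of the first file, b[k] is the first line of the second file only when both files have the same number of lines, as they do in the example.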

END     {for(j=0;j<i;) print a[j++],b[k++]}'

This part runs after the last line of the last file has been read and processed.
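
Putting the pieces together, the following sketch recreates the two example files from the question in a temporary directory and runs the command unchanged, capturing its output in a variable (the temporary directory and the variable name merged are assumptions made for this demonstration):

```shell
# Recreate the example input from the question.
tmpdir=$(mktemp -d)
printf '%s\n' 11 23 19 31 67 > "$tmpdir/file1.txt"
printf '%s\n' 13 19 25 67 93 > "$tmpdir/file2.txt"

# The command from the question, unchanged apart from the file paths.
merged=$(awk 'NR==FNR {a[i++]=$0};
             {b[x++]=$0;};{k=x-i};
     END     {for(j=0;j<i;) print a[j++],b[k++]}' \
  "$tmpdir/file1.txt" "$tmpdir/file2.txt")

printf '%s\n' "$merged"
rm -r "$tmpdir"
```

When END runs, i is 5, x is 10 and k is 5, so the loop pairs a[0] with b[5], a[1] with b[6], and so on. The result should be five lines: "11 13", "23 19", "19 25", "31 67" and "67 93".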

With these basics you should be able to figure out the meaning of the different code blocks and variables yourself.

Related Solutions

Where do these Terminal commands come from

Well, these commands edit .plist files in your /Users/xyz/Library/Preferences folder. So if you look at these files, you can base your Terminal commands on them.

These are XML-based files, so you can view them in a more readable format if you open them in Xcode, which is available in the Mac App Store.

In the Terminal, the format is "defaults write PLIST_FILE KEY -TYPE VALUE", where PLIST_FILE is the name of the file in the Preferences folder without the .plist extension, KEY is the key that you can see in Xcode, -TYPE is the type of the key as shown in Xcode (typing "defaults" into the Terminal lists the accepted types and any abbreviations), and VALUE is the value you wish to set the key to.

Here are the value types as listed in the Terminal:

-string <string_value>
-data <hex_digits>
-int[eger] <integer_value>
-float  <floating-point_value>
-bool[ean] (true | false | yes | no)
-date <date_rep>
-array <value1> <value2> ...
-array-add <value1> <value2> ...
-dict <key1> <value1> <key2> <value2> ...
-dict-add <key1> <value1> ...

Incorrect Terminal.app output for long lines with tabs

This has to do with line breaks. Terminal is looking for either a space or a continuous string of characters to make a line break. In this case, the first opportunity it sees to break the line is between the "y" and the "o". The "y" is shown because the last character of an extended line is displayed in the last position to hint to the user that something is happening there.

Workarounds...

You could try:

echo -e "a \tb \tc \td \te \tf \tg \tyo"

and everything will appear as you'd expect it.

Also, something like

echo -e "ab\tcd\tef\tgh\tij\tkl\tmn\tyo"

should break between the "k" and the "l".

Still, this is odd behavior and definitely worth a bug/radar report.
