Join two files on two common fields

awkfilestext processing

I have two files

file1.txt

78Z|033333157|0000001|PERD1|2150421|D|0507020|3333333311
78Z|033333157|0000001|PERD0|2160208|A|1900460|3333333311
78Z|033333157|0000001|RSAB1|2150421|D|0507070|3333333311
78Z|033333157|0000001|RSAB0|2160208|A|1900460|3333333311
78Z|033333157|0000001|ANT37|2141023|D|1245260|3333333311
78Z|033333157|0000001|ANT36|2150422|D|1518490|3333333311
78Z|033333157|0000001|ANT28|2150321|D|0502090|3333333311
78Z|033333157|0000001|ANT27|2150122|D|0501450|3333333311
78Z|033333157|0000001|ANT26|2141222|D|1637460|3333333311
78Z|033333157|0000001|ANT10|2160208|A|1900460|3333333311
78Z|033333157|0000001|ABS10|2151221|D|1223390|3333333311
78Z|696931836|0000001|PERD0|2160203|A|1114450|2222222222
78Z|696931836|0000001|RSAB0|2160203|A|1114450|2222222222
78Z|696931836|0000001|ANT09|2160203|A|1114450|2222222222
78Z|010041586|0000001|PERD0|2160119|A|1835100|3333333333
78Z|010041586|0000001|RSAB0|2160119|A|1835100|3333333333
78Z|010041586|0000001|ANT33|2160119|A|1835100|3333333333
78Z|011512345|0000001|PERD0|2151213|A|1413550|4444444444
78Z|011512345|0000001|RSAB0|2151213|A|1413550|4444444444
78Z|011512345|0000001|ANT32|2160219|A|0319230|4444444444
78Z|011512345|0000001|ANT09|2160218|D|0319230|4444444444
78Z|011512345|0000001|ANT07|2150729|D|1508230|4444444444
78Z|011512345|0000001|ANT06|2141013|D|1208190|4444444444
78Z|011512345|0000001|ABB06|2131224|D|1857030|4444444444
78Z|012344052|0000001|PERD0|2160203|A|1219570|5555555555
78Z|012344052|0000001|ANT50|2160203|A|1219570|5555555555
78Z|099999999|0000001|PERD0|2151214|A|1512460|6666666666
78Z|099999999|0000001|RSAB0|2151214|A|1512460|6666666666
78Z|099999999|0000001|ANT32|2160219|A|0319000|6666666666
78Z|099999999|0000001|ANT09|2160218|D|0319000|6666666666
78Z|099999999|0000001|ABS10|2150615|D|0125350|6666666666

file2.txt

3333333311|ANT10
2222222222|ANT09
5555555555|ANT50
3333333333|ANT33
6666666666|ANT32
4444444444|ANT09

I need a create new file with the lines matched by fourth and eighth column of the file1.txt with second and first column of the file2.txt

The result must be (the order is not important)

file3.txt

78Z|033333157|0000001|ANT10|2160208|A|1900460|3333333311
78Z|696931836|0000001|ANT09|2160203|A|1114450|2222222222
78Z|012344052|0000001|ANT50|2160203|A|1219570|5555555555
78Z|010041586|0000001|ANT33|2160119|A|1835100|3333333333
78Z|099999999|0000001|ANT32|2160219|A|0319000|6666666666
78Z|011512345|0000001|ANT09|2160218|D|0319230|4444444444

Best Answer

awk -F'|' 'NR==FNR{e[$2$1]=1;next};e[$4$8]' file2.txt file1.txt

First read file2 and set array e[field2+field1] then file1 and print if e[field4+field8] is set.

Or turn the fields around:

awk -F'|' 'NR==FNR{e[$1$2]=1;next};e[$8$4]' file2.txt file1.txt
Related Question