I have two files, the first one (tab-delimited) looks like:
1 100 371 R1,R2,R4 12
5 167 16 R2,R5 5
8 242 490 R1,R3,R4 11
another looks like:
R1 0.167
R2 0.171
R3 0.156
R4 0.162
R5 0.159
I want to add one more field in the first file, the values in this field are the sum of the matching values of R1, R2,…,R5 in the second file then divided by the values in the fourth field.
For example, the first line has R1, R2, R4, so the value I want is (0.167+0.171+0.162)/12 = 0.0416667
Expected output:
1 100 371 R1,R2,R4 12 0.0416667
5 167 16 R2,R5 5 0.066
8 242 490 R1,R3,R4 11 0.0440909
How to write the awk command?
Best Answer
awk solution:
The output:
a[$1]=$2
- capturing keys/values from the 2nd filesplit($4,b,",")
- splitting the 4th field of the 1st file into array of "keys"len
- number of chunkss+=a[b[i]]
- accumulating values for the matched "keys"