Bash – Do Bash or AWK have an IN operator like the R programming language?

awk bash r shell

In R, we have the %in% operator to check whether an element is present in a specific column.

For example, suppose we have two data frames, fruits and market, with the columns fruit_name and products respectively, and we want to check which fruits are available in the market.

In R,

available_fruit <- fruits$fruit_name %in% market$products

Is there an operator in bash or AWK that does something similar to %in% in R?

Best Answer

awk has an in operator. It tests whether an index (key) exists in an array (arrays in awk are associative arrays, i.e. hashes).

If the names of the fruits are keys in the array market then you may use

if (fruit_name in market) { ... }

to check whether the string in fruit_name is a key in market.
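As a rough, self-contained illustration of that test, the sketch below wraps it in a shell function. The function name in_market and the three sample fruits are made up for the example; only the in operator itself comes from awk.

```shell
# Hypothetical helper: report whether a fruit is a key in the market array.
in_market() {
    awk -v fruit_name="$1" 'BEGIN {
        # Sample data (an assumption): turn a list of names into array keys.
        n = split("apple banana cherry", items, " ")
        for (i = 1; i <= n; i++) market[items[i]] = 1

        if (fruit_name in market)
            print fruit_name " is in the market"
        else
            print fruit_name " is not in the market"
    }'
}

in_market banana
in_market mango
```

This prints "banana is in the market" followed by "mango is not in the market". Unlike referencing market[fruit_name] directly, the in test does not create the element as a side effect.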

For example

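BEGIN { FS = "\t" }

# First file: remember the price of each item
NR == FNR { market[$1] = $2; next }

# Second file: report items that are not in the market
!($1 in market) { printf("No %s in the market\n", $1 ); next }

# Second file: add up the prices of the items that are
{ sum += market[$1] }

END { printf("Total sum is %.2f\n", sum ) }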

Running this on two files:

$ awk -f script.awk market_prices mylist

where market_prices is a two-column, tab-delimited file of items and prices, and mylist is a list of items. The script reads the items and their prices from the first file and populates market with them. It then calculates the total cost of the items in the second file that exist in the market, and reports the items that can't be found.
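To try this out end to end, something like the following could be used. The file contents are invented sample data, the demo function is only a wrapper for the example, and the scratch directory is there so that no existing files are overwritten.

```shell
# Hypothetical demo: build sample input files and run the script on them.
demo() (
    # Work in a scratch directory (the subshell keeps the cd local).
    cd "$(mktemp -d)" || exit 1

    # Sample data (an assumption): tab-separated item and price.
    printf 'apple\t1.50\nbanana\t0.25\n' > market_prices
    printf 'apple\nbanana\nmango\napple\n' > mylist

    # The script from the answer, saved as script.awk.
    cat > script.awk <<'EOF'
BEGIN { FS = "\t" }
NR == FNR { market[$1] = $2; next }
!($1 in market) { printf("No %s in the market\n", $1 ); next }
{ sum += market[$1] }
END { printf("Total sum is %.2f\n", sum ) }
EOF

    awk -f script.awk market_prices mylist
)

demo
```

With this sample data the output is "No mango in the market" followed by "Total sum is 3.25" (two apples at 1.50 plus one banana at 0.25).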

The in operator may also be used to loop over the indexes of an array:

for (i in array) {
    print i, array[i]
}

The order in which the loop visits the indexes is unspecified, so do not rely on them coming out sorted.
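If a predictable order is needed, one portable option is to pipe the loop's output through sort. The sketch below assumes a small made-up market array and a hypothetical function name, sorted_market.

```shell
# Hypothetical helper: print all keys and values of a sample array, sorted by key.
sorted_market() {
    awk 'BEGIN {
        # Sample data (an assumption), deliberately inserted out of order.
        market["cherry"] = 7
        market["apple"] = 3
        market["banana"] = 5

        # The traversal order here is up to the awk implementation...
        for (i in market) print i, market[i]
    }' | sort    # ...so sort the lines afterwards.
}

sorted_market
```

This prints "apple 3", "banana 5" and "cherry 7" on separate lines, regardless of the order in which awk walks the array. GNU awk can also control the traversal order itself through PROCINFO["sorted_in"], but that is a gawk extension and not available in other awk implementations.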
