How to Pass Variables to Awk Command with Conditions

awkshell-script

I have a while loop which reads data from a file with city names:

city_name:

COIMBATORE
MADURAI
PONDICHERRY
SALEM
TRIPUR
TIRUCHI
TIRUNELVELI
TUTUCORIN
VELLORE

and using an awk command within the while loop and trying to read the variable defined in while loop body again in the awk statement however, this is not working and giving me errors. I understand that the awk requires the variables to be defined for its body separately to enable awk to understand the variable usage and could read its value.

My while loop is like the following:

while read city
do
        awk -F, '{
                        if ( $1 == "ACTIVE" ) &&
                        ( $2 == "$city" )
                        print $1
                }' siteDBName >> count

        SUM=`awk '{ sum += $1 } END { print sum }' count`

done < city_name

where the siteDBName file contains the multiple columns with data like the following:

siteDBName:

ACTIVE,COIMBATORE,MGT500,1,5,7....
INACTIVE,MADURAI,GT500,5,6,7...
ACTIVE,SALEM,TR600,6,4,6...
ACTIVE,COIMBATORE,GT500,4,5,6...
..
..

Here, i tried using the awk -v c=$city along with rest of the statements but this too gave me errors.

How can i use the variable used and initialized in the while loop inside the awk statement inside this while loop?

Best Answer

You have two basic choices: i) use -v to pass the variable to awk or ii) close the ' around the awk script, use the shell variable and continue the ' again.

  1. Use -v

    while read city
    do
        awk -v city="$city" -F, '{
                                  if ( $1 == "ACTIVE"  && $2 == city ){print $1}
                                 }' siteDBName >> count
    done < city_name
    
  2. Close the quote

    while read city
    do
        awk  -F, '{
                     if ( $1 == "ACTIVE"  && $2 == "'$city'" ){print $1}
                  }' siteDBName >> count
    done < city_name
    

I removed the SUM= line since it wasn't doing anything useful given that $1 is ACTIVE so adding it makes no sense. See my alternate approach below for a way to do this correctly.

Also, note that with this approach, you need to read the siteDBName file multiple times. A more efficient version would be:

$ awk -F, '{
             if(NR==FNR){cities[$1]++;}
             else if($1=="ACTIVE" && $2 in cities ){sum++}
            } END{print sum,"active cities"}' city_name siteDBName 
3 active cities
Related Question