Shell – Using AWK to select rows with specific value in specific column

awk csv linux shell

I have a big csv file, which looks like this:

1,2,3,4,5,6,-99
1,2,3,4,5,6,-99
1,2,3,4,5,6,-99
1,2,3,4,5,6,25178
1,2,3,4,5,6,27986
1,2,3,4,5,6,-99

I want to select only the lines in which the 7th column is equal to -99, so that my output is:

1,2,3,4,5,6,-99
1,2,3,4,5,6,-99
1,2,3,4,5,6,-99
1,2,3,4,5,6,-99

I tried the following:

awk -F, '$7 == -99' input.txt > output.txt
awk -F, '{ if ($7 == -99) print $1,$2,$3,$4,$5,$6,$7 }' input.txt > output.txt

But both of them returned an empty output.txt. Can anyone tell me what I'm doing wrong?
Thanks.

Best Answer

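The file that you run the script on has DOS line endings (each line ends in a carriage return followed by a newline), probably because it was created on a Windows machine. The 7th field is therefore -99 followed by a carriage return, not just -99, and the comparison fails.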
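If you want to confirm the diagnosis before converting anything, a quick check along these lines may help. It is only a sketch: the sample function is a hypothetical stand-in for your real file and simply emits two records with DOS line endings.

```shell
# Hypothetical stand-in for input.txt: two records with DOS (CRLF) line endings.
sample() { printf '1,2,3,4,5,6,-99\r\n1,2,3,4,5,6,25178\r\n'; }

# od -c prints each byte; a \r in front of every \n indicates DOS line endings.
sample | od -c

# Count the lines that end in a carriage return (0 would mean a clean Unix file).
crlf_lines=$(sample | awk '/\r$/ { n++ } END { print n + 0 }')
echo "$crlf_lines"
```

On the real file, the same checks would be od -c input.txt | head and the awk command with input.txt as its argument.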

Use dos2unix to convert it to a Unix text file.

Alternatively, run it through tr:

tr -d '\r' <input.txt >input-unix.txt

Then use input-unix.txt with your otherwise correct awk code.
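If you would rather not keep an intermediate file, the two steps can probably be combined in a pipeline. A minimal sketch, again with a hypothetical sample function standing in for input.txt:

```shell
# Hypothetical stand-in for input.txt: three records with DOS (CRLF) line endings.
sample() { printf '1,2,3,4,5,6,-99\r\n1,2,3,4,5,6,25178\r\n1,2,3,4,5,6,-99\r\n'; }

# Strip the carriage returns on the fly, then filter on the 7th field.
matches=$(sample | tr -d '\r' | awk -F, '$7 == -99')
printf '%s\n' "$matches"
```

With the real file this would read: tr -d '\r' <input.txt | awk -F, '$7 == -99' >output.txt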


To modify the awk code instead of the input file:

awk -F, '$7 == "-99\r"' input.txt >output.txt

This takes the carriage-return at the end of the line into account.

Or,

awk -F, '$7 + 0 == -99' input.txt >output.txt

Adding 0 forces the 7th field to be converted to a number, and the conversion ignores the trailing carriage return.

Similarly,

awk -F, 'int($7) == -99' input.txt >output.txt

would also ignore the \r, although int() truncates, so a value such as -99.5 would match as well.
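Note that the awk variants above only change the comparison; the lines they print still end in a carriage return. If the output should be a clean Unix file, one possible approach is to strip the carriage return inside awk before testing the field. This is a sketch, with a hypothetical sample function standing in for input.txt:

```shell
# Hypothetical stand-in for input.txt: three records with DOS (CRLF) line endings.
sample() { printf '1,2,3,4,5,6,-99\r\n1,2,3,4,5,6,25178\r\n1,2,3,4,5,6,-99\r\n'; }

# Remove a trailing carriage return from each record, then test the 7th field.
# Modifying $0 makes awk re-split the fields, so $7 no longer carries the \r.
matches=$(sample | awk -F, '{ sub(/\r$/, "") } $7 == -99')
printf '%s\n' "$matches"
```

Because the carriage return is removed from the whole record, both the comparison and the printed lines are free of it.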
