Find only the matched pattern in a CSV file

awk, perl, sed, text-processing

I am trying to print only the matched pattern from a CSV file. For example, print every column value that starts with 35=, together with its value. Thanks.

CSV file:

35=A,D=35,C=129,ff=136
D=35,35=BCD,C=129,ff=136
900035=G,D=35,C=129,ff=136
35=EF,D=35,C=129,ff=136,35=G
36=o,D=35,k=1

Output:

35=A
35=BCD
35=EF
35=G

The command I used did not work:

sed -n '/35=[A-Z]*?/ s/.*\(35=[A-Z]*?\).*/\1/p' filename

Best Answer

With GNU grep, which supports the -o option to print only the matched part of each line, one match per line:

$ grep -oE '\b35=[^,]+' ip.csv 
35=A
35=BCD
35=EF
35=G
  • \b is a word boundary, so 900035 won't match
  • [^,]+ matches one or more non-comma characters
  • this assumes the values do not contain ,
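
If grep's -o option is not available, a rough equivalent is to put each field on its own line first and then filter. This is only a sketch: it recreates the sample data from the question as ip.csv and, like the grep command above, assumes the values do not contain ,

```shell
# Recreate the sample input from the question
cat > ip.csv <<'EOF'
35=A,D=35,C=129,ff=136
D=35,35=BCD,C=129,ff=136
900035=G,D=35,C=129,ff=136
35=EF,D=35,C=129,ff=136,35=G
36=o,D=35,k=1
EOF

# Put every field on its own line, then keep the lines that start with 35=
tr ',' '\n' < ip.csv | grep '^35='
```

Because each field becomes its own line, anchoring with ^ takes the place of \b, so 900035=G is still skipped.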


With awk

$ awk -F, '{ for(i=1;i<=NF;i++){if($i~/^35=/) print $i} }' ip.csv 
35=A
35=BCD
35=EF
35=G
  • -F, sets , as the input field separator
  • for(i=1;i<=NF;i++) iterates over all fields
  • if($i~/^35=/) checks whether the field starts with 35=
    • print $i prints that field
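
As a variation on the awk loop above, the key can be passed in as a variable instead of being written into the regex. This is a minimal sketch: the variable name key is hypothetical, the sample data from the question is recreated as ip.csv, and index() is used for a literal prefix test so that the key is not interpreted as a regular expression.

```shell
# Recreate the sample input from the question
cat > ip.csv <<'EOF'
35=A,D=35,C=129,ff=136
D=35,35=BCD,C=129,ff=136
900035=G,D=35,C=129,ff=136
35=EF,D=35,C=129,ff=136,35=G
36=o,D=35,k=1
EOF

# key is a hypothetical variable name; index(...) == 1 means the field
# starts with the literal string "35="
awk -F, -v key=35 '{
    for (i = 1; i <= NF; i++)
        if (index($i, key "=") == 1)
            print $i
}' ip.csv
```

Changing key=35 to another key, such as key=D, selects those fields instead without editing the awk program.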

Similarly with perl

$ perl -F, -lane 'foreach (@F){print if /^35=/}' ip.csv