How to remove symbols from a column using awk

awk, sed, text processing

I have data like this:

chr1    134901  139379  -   "ENSG00000237683.5";
chr1    860260  879955  +   "ENSG00000187634.6";
chr1    861264  866445  -   "ENSG00000268179.1";
chr1    879584  894689  -   "ENSG00000188976.6";
chr1    895967  901095  +   "ENSG00000187961.9";

I generated it by parsing a GTF file.

I want to remove the double quotes and semicolons from column 5 using awk or sed, if possible. The result would look like this:

chr1    134901  139379  -   ENSG00000237683.5
chr1    860260  879955  +   ENSG00000187634.6
chr1    861264  866445  -   ENSG00000268179.1
chr1    879584  894689  -   ENSG00000188976.6
chr1    895967  901095  +   ENSG00000187961.9

Best Answer

Using gsub:

awk '{gsub(/[";]/, "")} 1' file
chr1    134901  139379  -   ENSG00000237683.5
chr1    860260  879955  +   ENSG00000187634.6
chr1    861264  866445  -   ENSG00000268179.1
chr1    879584  894689  -   ENSG00000188976.6
chr1    895967  901095  +   ENSG00000187961.9
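
Since the question also allows sed, the same whole-line cleanup can be sketched with a single substitution. This is a minimal sketch rather than a definitive solution: the two sample records are copied from the question, and the tab delimiter and the variable names (input, sed_result) are assumptions made for illustration.

```shell
# Two sample records in the question's format (the tab delimiter is an assumption).
input=$(printf 'chr1\t134901\t139379\t-\t"ENSG00000237683.5";\nchr1\t860260\t879955\t+\t"ENSG00000187634.6";\n')

# Delete every double quote and semicolon on each line.
sed_result=$(printf '%s\n' "$input" | sed 's/[";]//g')

printf '%s\n' "$sed_result"
```

On a real file this would be sed 's/[";]//g' file. Like the unrestricted gsub, it touches every column, not just the fifth.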

If you want to operate only on the fifth field and preserve any quotes or semicolons in other fields:

awk '{gsub(/[";]/, "", $5)} 1' file
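
One thing to watch with the field-specific form: modifying $5 makes awk rebuild the record using OFS, which is a single space by default, so the original column separators are replaced. A hedged sketch of one way around this, assuming the input is tab-delimited (the question does not say so explicitly), is to set both FS and OFS to a tab. The sample data and the variable names (input, result) are illustrative only.

```shell
# Two sample records in the question's format (assumed to be tab-delimited).
input=$(printf 'chr1\t134901\t139379\t-\t"ENSG00000237683.5";\nchr1\t860260\t879955\t+\t"ENSG00000187634.6";\n')

# Split and rejoin on tabs so that rewriting $5 does not collapse the separators.
result=$(printf '%s\n' "$input" | awk 'BEGIN { FS = OFS = "\t" } { gsub(/[";]/, "", $5) } 1')

printf '%s\n' "$result"
```

If the columns are separated by runs of spaces instead, the default field splitting still works, but the output will be joined by whatever OFS is set to.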