Linux – How to remove line if it contains a character exactly once

awk, linux, sed, text processing

I want to remove every line from a file that contains a particular character exactly once. If the character appears more than once, or not at all, the line should stay in the file.

For example:

DTHGTY
FGTHDC
HYTRHD
HTCCYD
JUTDYC

Here, the character I am interested in is C, so the command should remove the lines FGTHDC and JUTDYC, because they contain C exactly once.

How can I do this using either sed or awk?

Best Answer

In awk you can set the field separator to anything. If you set it to C, each line has one more field than it has occurrences of C.

So if you run awk -F'C' '{print NF}' <<< "C1C2C3" you get 4: C1C2C3 contains 3 Cs, and hence 4 fields.

You want to remove lines in which C occurs exactly once. With C as the field separator, those are the lines that have exactly two fields. So just skip them:
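To see the field counts for the sample data from the question, here is a small sketch. The input is fed with printf instead of a file, and the variable name counts is only for illustration:

```shell
# Print each sample line followed by its field count when C is the separator.
counts=$(printf '%s\n' DTHGTY FGTHDC HYTRHD HTCCYD JUTDYC |
  awk -F'C' '{ print $0, NF }')
printf '%s\n' "$counts"
```

Lines with a single C, such as FGTHDC, report 2 fields even when the C is the last character, because the empty string after the separator still counts as a field.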

$ awk -F'C' 'NF!=2' file
DTHGTY
HYTRHD
HTCCYD
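Since the question also asks about sed, one possible equivalent (not part of the answer above, just a sketch) is to delete every line that matches "no C, then one C, then no C up to the end of the line". The variable name kept is only for illustration:

```shell
# Delete lines that consist of non-C characters, a single C, and more non-C characters.
kept=$(printf '%s\n' DTHGTY FGTHDC HYTRHD HTCCYD JUTDYC |
  sed '/^[^C]*C[^C]*$/d')
printf '%s\n' "$kept"
```

On a real file the same expression would be used as sed '/^[^C]*C[^C]*$/d' file.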

Related Solutions

Text Processing – How to Remove Lines with Specific Line Numbers

You can use awk as well:

awk 'NR==FNR { nums[$0]; next } !(FNR in nums)' linenum infile

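In the special case where the 'linenum' file is empty, awk reads no records from it, so NR==FNR is still true while it reads 'infile'. Every line of 'infile' is then stored as a line number instead of being printed, and the output is empty. To fix that, use the command below: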

awk 'NR==FNR && FILENAME==ARGV[1]{ nums[$0]; next } !(FNR in nums)' linenum infile
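The difference between the two commands can be checked with an empty list of line numbers. This is a sketch that assumes mktemp -d is available. The scratch directory, the sample contents and the variable names naive and guarded are made up for the demonstration:

```shell
# Work in a scratch directory so no existing files are touched.
tmpdir=$(mktemp -d)
printf '%s\n' alpha beta gamma > "$tmpdir/infile"
: > "$tmpdir/linenum"        # empty: no line numbers to remove

# Without the guard, NR==FNR stays true while reading infile,
# so every line lands in nums and nothing is printed.
naive=$(awk 'NR==FNR { nums[$0]; next } !(FNR in nums)' \
  "$tmpdir/linenum" "$tmpdir/infile")

# With the FILENAME guard, infile falls through to the second rule
# and is printed in full.
guarded=$(awk 'NR==FNR && FILENAME==ARGV[1] { nums[$0]; next } !(FNR in nums)' \
  "$tmpdir/linenum" "$tmpdir/infile")

printf 'naive: [%s]\nguarded: [%s]\n' "$naive" "$guarded"
rm -rf "$tmpdir"
```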

or even better (thanks to Stéphane Chazelas):

awk '!firstfile_proceed { nums[$0]; next } 
     !(FNR in nums)' linenum firstfile_proceed=1 infile
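This last form relies on awk processing a var=value operand only when it reaches that position in the argument list, so firstfile_proceed is still unset while 'linenum' is read and set to 1 before 'infile' is opened. A sketch of both the normal and the empty case, again assuming mktemp -d is available, with made-up file contents and variable names:

```shell
# Work in a scratch directory so no existing files are touched.
tmpdir=$(mktemp -d)
printf '%s\n' alpha beta gamma delta > "$tmpdir/infile"
printf '%s\n' 2 4 > "$tmpdir/linenum"
: > "$tmpdir/empty"          # an empty list of line numbers

# Normal case: lines 2 and 4 of infile are dropped.
removed=$(awk '!firstfile_proceed { nums[$0]; next }
               !(FNR in nums)' "$tmpdir/linenum" firstfile_proceed=1 "$tmpdir/infile")

# Empty list: the assignment still happens before infile is read,
# so every line is printed.
untouched=$(awk '!firstfile_proceed { nums[$0]; next }
                 !(FNR in nums)' "$tmpdir/empty" firstfile_proceed=1 "$tmpdir/infile")

printf '%s\n' "$removed" "---" "$untouched"
rm -rf "$tmpdir"
```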
