Ubuntu – Delete lines that come after a line with a specific pattern in Shell

bash, command line, scripts, text processing

I'm trying to delete all the lines that come after a specific pattern in a set of files.

I have many files, which all have the same structure:

Example:

file1

line 1
...
line x "here there is a specific pattern"
...
EOF

file n

line 1
...
line x "here there is a specific pattern"
...
EOF

I looked for a simple solution, but since I have many files, I ended up taking the long way round :p

The pattern appears once in each file.

So I collected the line numbers of the lines that contain this pattern and saved them in one file.

This is my code:

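# Count the .txt files (the loop below assumes they are named 1.txt, 2.txt, ...).
count=$(ls -f path_to_folder/*.txt | wc -l)
echo "Number of txt files: $count"

# Record the line number of PATTERN in each file, one number per line.
for ((i = 1; i <= count; i++)); do
    vt=$(grep -n PATTERN "$i.txt" | cut -d : -f 1)
    echo "$vt" >> PATTERN_line.txt
done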

Each line of PATTERN_line.txt holds the line number at which the pattern occurs in the corresponding file.

Now I'm trying to use those numbers to delete every line after the pattern, through the end of each file.

In other words, I need to keep each file from its first line up to and including the pattern line.

I appreciate your help

Best Answer

This is straightforward with text-processing utilities. For example, using sed:

sed '1,/pattern/!d' file

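This means: select every line from the first one through the first line that matches pattern, and delete all the lines outside that range. Replace pattern with your own pattern. sed only starts looking for the end of the range on the line after line 1, so this assumes the pattern does not sit on the very first line. If the pattern contains /, you need to escape each one with a backslash. For example, if the pattern is pattern-with/character: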

sed '1,/pattern-with\/character/!d' file
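As an alternative to escaping, sed also accepts a custom delimiter for an address regex when it is introduced with a backslash. The sketch below compares the two spellings on a made-up sample; the file contents are hypothetical and only the sed expression follows the answer.

```shell
# Hypothetical sample file; the marker text contains a slash.
tmp=$(mktemp)
printf '%s\n' 'line 1' 'line 2' 'stop at pattern-with/character here' 'line 4' 'line 5' > "$tmp"

# Escaped slash inside the default /.../ delimiters.
escaped=$(sed '1,/pattern-with\/character/!d' "$tmp")

# Same address with a custom delimiter: \% opens the regex and % closes it,
# so the slash needs no escaping.
custom=$(sed '1,\%pattern-with/character%!d' "$tmp")

printf '%s\n' "$escaped"
rm -f "$tmp"
```

Both commands should keep the first three lines and drop the rest.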

To actually edit files (rather than print the edited stream to stdout), you can use the -i flag:

sed -i '1,/pattern/!d' file

You can make a backup of the original file by appending an extension for the old file to -i. Take care here: there must be no space between -i and the extension.

sed -i.backup '1,/pattern/!d' file
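A quick way to see what the backup form does is to run it on a throwaway file. This is a minimal sketch assuming GNU sed (the default on Ubuntu); the directory, file name and contents are hypothetical.

```shell
# Work in a scratch directory so nothing real is modified.
dir=$(mktemp -d)
printf '%s\n' 'line 1' 'line 2 PATTERN' 'line 3' 'line 4' > "$dir/sample.txt"

# Edit in place; the untouched original is kept as sample.txt.backup.
sed -i.backup '1,/PATTERN/!d' "$dir/sample.txt"

edited=$(cat "$dir/sample.txt")
backup_lines=$(wc -l < "$dir/sample.txt.backup")

echo "edited file:"
echo "$edited"
echo "lines kept in backup: $backup_lines"
rm -rf "$dir"
```

The edited file should end at the PATTERN line, while the backup still holds all four lines.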

sed takes multiple filename arguments. For example, to act on all the non-hidden files in the current directory you could use:

sed -i '1,/pattern/!d' *
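Since a bare * also matches directories and unrelated files, it may be safer to narrow the glob. The following sketch, assuming GNU sed and two hypothetical .txt files, shows that with -i each file is truncated at its own match.

```shell
# Scratch directory with two hypothetical files; the pattern is on a different line in each.
dir=$(mktemp -d)
printf '%s\n' 'a1' 'a2 PATTERN' 'a3' > "$dir/file1.txt"
printf '%s\n' 'b1' 'b2' 'b3 PATTERN' 'b4' 'b5' > "$dir/file2.txt"

# With -i, GNU sed treats every file separately, so line 1 restarts per file.
sed -i '1,/PATTERN/!d' "$dir"/*.txt

count1=$(wc -l < "$dir/file1.txt")
count2=$(wc -l < "$dir/file2.txt")
last2=$(tail -n 1 "$dir/file2.txt")

echo "file1.txt: $count1 lines, file2.txt: $count2 lines"
rm -rf "$dir"
```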

Related Solutions

Ubuntu – How to delete lines from a file until a specific pattern

sed

sed '1,/mail@server\.com/d'  # excluding the matched line
sed '/mail@server\.com/,$!d' # including the matched line

Explanations

1,/mail@server\.com/d – delete every line from line 1 through (,) the first line matching mail@server.com
/mail@server\.com/,$!d – delete everything except (!) the lines from the first match of mail@server.com to (,) the end of the file ($)
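The difference between the two commands is easiest to see side by side. This is a sketch on made-up input (the header and body lines are hypothetical); only the two sed expressions come from the answer.

```shell
# Hypothetical five-line input with the address on line 3.
input=$(printf '%s\n' 'header 1' 'header 2' 'From: mail@server.com' 'body 1' 'body 2')

# Delete line 1 through the match: the matching line itself is gone.
excluding=$(printf '%s\n' "$input" | sed '1,/mail@server\.com/d')

# Delete everything outside "match to end of file": the matching line stays.
including=$(printf '%s\n' "$input" | sed '/mail@server\.com/,$!d')

echo "excluding the match:"; echo "$excluding"
echo "including the match:"; echo "$including"
```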

Usage

sed '…' file > file2 # save output in file2
sed -i.bak '…' file  # alter file in-place saving a backup as file.bak
sed -i '…' file      # alter file in-place without backup (caution!)

awk

awk 'f;/mail@server\.com/{f=1}' # excluding the matched line
awk '/mail@server\.com/{f=1}f'  # including the matched line

Explanations

f – a variable; awk variables default to 0 (false). A bare expression used as a pattern prints the line when it is true and prints nothing when it is false
/mail@server\.com/{f=1} – when a line matches mail@server.com, set f=1, so that f evaluates as true every time it is tested from then on
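To make the flag logic more explicit, the terse one-liner can be spelled out with separate rules. The sketch below uses a hypothetical sample input and should behave the same as the commands above; the spelled-out form is an illustration, not part of the original answer.

```shell
# Hypothetical five-line input with the address on line 3.
input=$(printf '%s\n' 'header 1' 'header 2' 'From: mail@server.com' 'body 1' 'body 2')

# Terse form from the answer: f is tested before it is set, so the match is excluded.
terse=$(printf '%s\n' "$input" | awk 'f;/mail@server\.com/{f=1}')

# The same logic written as two explicit rules.
verbose=$(printf '%s\n' "$input" | awk '
    f { print }                    # print only once the flag has been raised
    /mail@server\.com/ { f = 1 }   # raise the flag after the test above
')

# Swapping the order raises the flag first, so the matching line is printed too.
including=$(printf '%s\n' "$input" | awk '/mail@server\.com/{f=1}f')

echo "$verbose"
```

The order of the two rules is the only thing that decides whether the matching line is printed.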

Usage

awk '…' file > file2                          # save output in file2
awk -iinplace -vINPLACE_SUFFIX=.bak '…' file  # alter file in-place saving a backup as file.bak
awk -iinplace '…' file                        # alter file in-place without backup (caution!)
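The -i inplace option is a GNU awk extension, so it may be unavailable where awk is provided by another implementation. A portable fallback, sketched here on a hypothetical mail.txt, is to write to a temporary file and move it over the original.

```shell
# Scratch directory and hypothetical input file.
dir=$(mktemp -d)
file="$dir/mail.txt"
printf '%s\n' 'header 1' 'From: mail@server.com' 'body 1' > "$file"

# Write the filtered output next to the original, then replace it only on success.
tmpfile=$(mktemp "$dir/tmp.XXXXXX")
awk '/mail@server\.com/{f=1}f' "$file" > "$tmpfile" && mv "$tmpfile" "$file"

result=$(cat "$file")
echo "$result"
rm -rf "$dir"
```

The replaced file takes on the temporary file's permissions, so this sketch is not a drop-in substitute where file modes matter.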

Ubuntu – awk: print only lines that come after specific guard line pattern

Regex-oriented

$ awk '/^\|m\|/ {/\|head1\|/ ? p=1 : p=0} p' example_file.txt 
|m|head1|
|3,4,6|
|3e,2,23|
|m|head1|
|dljs,wqpw,2;a|
|dllw,w1p,1q;a|

or field-oriented

$ awk -F'|' '$2 == "m" {$3 == "head1" ? p=1 : p=0} p' example_file.txt 
|m|head1|
|3,4,6|
|3e,2,23|
|m|head1|
|dljs,wqpw,2;a|
|dllw,w1p,1q;a|

p is effectively a print flag.

Awk programs consist of pattern {action} pairs, in which action is executed on a record if pattern evaluates true (non-zero). You can omit pattern, in which case {action} is executed for every record, or omit {action}, in which case awk applies the default action, which is to print the record. The latter is what's happening here with the bare p.
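The three forms can be tried on a small input. In this sketch the input is hypothetical (modelled on the output shown above, with an invented head2 section), and the flag rule is written as a plain assignment rather than the ternary used in the answer.

```shell
# Hypothetical input; the head2 section is invented for illustration.
input=$(printf '%s\n' '|m|head1|' '|3,4,6|' '|m|head2|' '|x,y,z|' '|m|head1|' '|dljs,wqpw,2;a|')

# Pattern only: the default action prints each record for which the pattern is true.
pattern_only=$(printf '%s\n' "$input" | awk 'NR == 2')

# Action only: the action runs for every record.
action_only=$(printf '%s\n' "$input" | awk '{ print NR }' | tail -n 1)

# Flag idiom: the first rule updates p on guard lines, the bare pattern p prints.
flagged=$(printf '%s\n' "$input" | awk -F'|' '$2 == "m" { p = ($3 == "head1") } p')

echo "$flagged"
```

Only the head1 guard lines and the records that follow them should be printed; the head2 section is skipped.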
