Ubuntu – How to find lines matching a pattern and delete them

command line, text processing

In a file with lots of lines, I want to delete the lines that start with HERE IT IS.

How can I do this using only command-line tools?

Best Answer

Try sed:

sed -i '/^HERE IT IS/d' <file>

WARNING: It's better to take a backup when using the -i switch of sed:

sed -i.bak '/^HERE IT IS/d' <file>

The original file will remain as <file>.bak and the modified file will be <file>.
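To see the effect before touching a real file, the commands above can be tried on a throwaway file. This is only a minimal sketch, assuming GNU sed as shipped with Ubuntu; the sample contents and the mktemp-generated path are made up for illustration.

```shell
# Create a scratch file with sample content (hypothetical data).
demo_file="$(mktemp)"
printf '%s\n' \
    'first line' \
    'HERE IT IS: delete me' \
    'middle line' \
    'HERE IT IS again' \
    'this one says HERE IT IS but not at the start' > "$demo_file"

# Preview: without -i, sed prints the result and leaves the file untouched.
sed '/^HERE IT IS/d' "$demo_file"

# Edit in place, keeping the original as "$demo_file.bak".
sed -i.bak '/^HERE IT IS/d' "$demo_file"

# Compare the backup with the edited file to see what was removed.
# diff exits non-zero when the files differ, so ignore its status here.
diff "$demo_file.bak" "$demo_file" || true
```

Because the pattern is anchored with ^, the last sample line survives: it contains HERE IT IS, but not at the start of the line.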

Related Solutions

Ubuntu – How to delete random lines from a file

You can probably solve it more efficiently than with a for-loop that needs to process the whole file once per line to remove.

filename="/PATH/TO/FILE"
number=5

line_count="$(wc -l < "$filename")"
line_nums_to_delete="$(shuf -i "1-$line_count" -n "$number")"
sed_script="$(printf '%dd;' $line_nums_to_delete)"

sed -i.bak -e "$sed_script" "$filename"

Or in one line (after defining the filename and number variables or replacing them manually):

sed -i.bak -e "$(printf '%dd;' $(shuf -i "1-$(wc -l < "$filename")" -n "$number"))" "$filename"

The -i.bak switch tells sed to edit the input file in place, but keep a backup copy of the original data, named like the input file with .bak appended to the file name. If you don't want it to make a copy, just write -i.

Btw, you don't have to use variables as I did. You can also directly replace "$number" and both occurrences of "$filename" with the appropriate values. I just did it this way for clarity.

To break down and explain the rest of the command:

sed -e "SCRIPT" "$filename"

runs the text processing tool sed on the file specified by the filename variable, applying the instructions given as the SCRIPT argument.

Our SCRIPT is dynamically generated in the lines above it, which run commands and assign their outputs to variables. Here we use these commands:

wc -l < "$filename" reads in the file specified by the filename variable and outputs the number of lines this file contains.
- In your case, this should return roughly 10000 according to the size you mentioned in the question.
shuf -i "1-$line_count" -n "$number" returns as many unique random numbers as specified by the number variable, in the range 1 to $line_count (both boundaries inclusive).
- For example, shuf -i 1-6 -n 2 would emulate throwing two regular six-sided dice, except that both can never show the same number.
printf '%dd;' ARGUMENTS returns a formatted string, taking in all ARGUMENTS (not quoted this time to treat each random number as a separate argument). The format string %dd; will be repeated while there are arguments left, and %d will be replaced with the argument represented as a decimal number.
- Therefore, e.g. an input of 1 7 42 would result in an output of 1d;7d;42d;.
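The three building blocks can also be tried one at a time. The following is a small sketch, assuming GNU coreutils; the scratch file and its six lines are made up for illustration, and the shuf output differs on every run.

```shell
# Sample input: a scratch file with 6 lines (hypothetical data).
sample="$(mktemp)"
printf 'line %d\n' 1 2 3 4 5 6 > "$sample"

# With input redirection, wc -l prints only the count, without the file name.
line_count="$(wc -l < "$sample")"
echo "$line_count"

# shuf picks distinct numbers from the inclusive range; output varies per run.
shuf -i "1-$line_count" -n 2

# printf reuses the format string for every argument it is given.
sed_script="$(printf '%dd;' 1 7 42)"
echo "$sed_script"    # prints 1d;7d;42d;
```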

The resulting $sed_script is finally our SCRIPT for sed. A plain number is treated as an address, i.e. the line number on which to apply an action, starting at 1 for the first line of the input file. d is the command to delete the addressed line, and ; separates multiple sed script commands.

All together, the whole command first examines your input file, as specified in the filename variable, and counts its lines. Then it generates as many unique random numbers as the number variable specifies, in the range 1 to the number of lines, and constructs a sed script out of these to delete each of those lines. Finally, sed runs that script on the file, modifying it.
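If you need this more than once, the same steps can be wrapped in a small shell function. This is just one possible sketch: delete_random_lines is a hypothetical helper name, not a standard command, and the range check is an addition that is not part of the commands above. It assumes GNU sed and shuf.

```shell
# Delete "count" random lines from "file", keeping a backup as "file.bak".
delete_random_lines() {
    local file="$1" count="$2" total
    total="$(wc -l < "$file")"
    # Guard: with count 0, printf would still emit "0d;", which sed rejects.
    if [ "$count" -lt 1 ] || [ "$count" -gt "$total" ]; then
        echo "count must be between 1 and $total" >&2
        return 1
    fi
    # Word splitting of the shuf output is intended: one argument per number.
    sed -i.bak -e "$(printf '%dd;' $(shuf -i "1-$total" -n "$count"))" "$file"
}

# Try it on a scratch file with 20 numbered lines.
scratch="$(mktemp)"
seq 1 20 > "$scratch"
delete_random_lines "$scratch" 5
wc -l < "$scratch"        # 15 lines remain
wc -l < "$scratch.bak"    # the backup still has all 20
```

sed reads the file only once, no matter how many line numbers the generated script contains.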
