Sed or tr one-liner to delete all numeric digits

regular expression, sed, text processing

So, I have this text file, and it consists mostly of alphanumeric characters. It's a standard document, but since I copied and pasted it from a PDF, there are page numbers in there. I don't much care about the occasional number that isn't a page number, so I figure I'll wipe them all out with sed or tr. That's at least marginally faster than finding and replacing zero, then one, then two, and so on in a GUI, after all.

So how do I do that?

Best Answer

To remove all digits, here are a few possibilities:

tr -d 0-9 <old.txt >new.txt
tr -d '[:digit:]' <old.txt >new.txt
sed -e 's/[0-9]//g' <old.txt >new.txt
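
As a quick sanity check, the three commands can be run on a short string instead of a file. This is only a sketch: the sample text and the variable names are made up for illustration and are not part of the question.

```shell
# Hypothetical sample input (an assumption, not the asker's file).
sample='Page 12 of chapter 3'

# Each command reads standard input, just like the <old.txt form above.
out_tr=$(printf '%s\n' "$sample" | tr -d 0-9)
out_class=$(printf '%s\n' "$sample" | tr -d '[:digit:]')
out_sed=$(printf '%s\n' "$sample" | sed -e 's/[0-9]//g')

# All three lines printed here should be identical.
printf '%s\n' "$out_tr" "$out_class" "$out_sed"
```

Note that only the digits are removed; the spaces that surrounded them stay behind.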

If you just want to get rid of the page numbers, there's probably a better regexp you can use, to recognize just those digits that are page numbers. For example, if the page numbers are always alone on a line except for whitespace, the following command will delete just the lines containing nothing but a number surrounded by whitespace:

sed -e '/^ *[0-9]\+ *$/d' <old.txt >new.txt

(\+ is a GNU extension; with some sed implementations you may need the standard alternative \{1,\}, or write [0-9][0-9]* instead.)
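
Here is a small sketch of the page-number deletion using the portable [0-9][0-9]* spelling. The sample text is invented for illustration; it assumes page numbers sit alone on a line, padded only by spaces.

```shell
# Hypothetical input: two page-number lines ("  12  " and "7") among text.
input=$(printf 'Intro text\n  12  \nChapter 3 begins\n7\nThe year 1999 stays\n')

# Delete lines made of optional spaces, one or more digits, optional spaces.
result=$(printf '%s\n' "$input" | sed -e '/^ *[0-9][0-9]* *$/d')

printf '%s\n' "$result"
```

Lines that contain digits alongside other text are left untouched, since the pattern is anchored at both ends of the line.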

You don't need to use the command line for this, though. Any halfway decent editor has regexp search and replacement capabilities.

Related Solutions

Sed one-liner to delete any line that begins with a digit

sed -e '/^[0-9]/d' filename > filename.new

or, to modify the file in place (the -i option is an extension, not part of POSIX sed):

sed -i -e '/^[0-9]/d' filename
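
Where -i is unavailable or behaves differently (BSD and macOS sed expect a backup-suffix argument after -i), the first form can be combined with mv to get the same effect. The following is a sketch only; the scratch directory and the file name notes.txt are hypothetical.

```shell
# Hypothetical scratch file in a temporary directory.
dir=$(mktemp -d)
printf '1 first\nkeep me\n42\nline 2 stays\n' > "$dir/notes.txt"

# Write the filtered text to a new file, then replace the original
# only if sed succeeded.
sed -e '/^[0-9]/d' "$dir/notes.txt" > "$dir/notes.txt.new" &&
  mv "$dir/notes.txt.new" "$dir/notes.txt"

remaining=$(cat "$dir/notes.txt")
printf '%s\n' "$remaining"

# Clean up the scratch directory.
rm -r "$dir"
```

Only lines whose first character is a digit are removed, so "line 2 stays" survives.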

Sed – One-Liner to Delete Everything Between Brackets

Replace [some text] with the empty string. Assuming you don't need to handle nested brackets, the text between the brackets cannot itself contain any brackets.

sed -e 's/\[[^][]*\]//g'

Note that in the bracket expression [^][] to match anything but [ or ], the ] must come first. Normally a ] would end the character set, but if it's the first character in the set (here, after the ^ complementation character), the ] stands for itself.
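
A short sketch of how this substitution behaves, using invented sample strings. The second case shows what happens with nested brackets: only the innermost pair is matched.

```shell
# Hypothetical sample lines (assumptions for illustration).
line='keep [drop this] and [this too] text'
nested_line='a [b [c] d] e'

stripped=$(printf '%s\n' "$line" | sed -e 's/\[[^][]*\]//g')
nested=$(printf '%s\n' "$nested_line" | sed -e 's/\[[^][]*\]//g')

# First line: both bracketed groups are gone.
# Second line: only the inner [c] is removed; the outer pair remains.
printf '%s\n' "$stripped" "$nested"
```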

If you do want to parse nested brackets, or if the bracketed text can span multiple lines, sed isn't the right tool.
