So, I have this text file, and it consists mostly of alphanumeric characters. It's a standard document. But since I copied and pasted it from a PDF, there are page numbers in there. I don't much care about losing the occasional number that isn't a page number, so I figure I'll wipe out all the digits with sed or tr. That should be at least marginally faster than finding and replacing first zero, then one, then two, etc. in the GUI, after all.
So how do I do that?
Best Answer
To remove all digits, here are a few possibilities:
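As a minimal sketch, assuming the text arrives on standard input and only ASCII digits matter, either of the following would do it. The function names are made up for illustration.

```shell
# Hypothetical helpers; both read standard input and write standard output.

# Delete every ASCII digit with tr.
strip_digits_tr() {
  tr -d '0-9'
}

# The same with sed: replace each digit with nothing, everywhere on the line.
strip_digits_sed() {
  sed 's/[0-9]//g'
}
```

With placeholder file names, usage would look like `strip_digits_tr < old.txt > new.txt`. Writing to a new file rather than overwriting the original keeps the source intact in case too much gets removed.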
If you just want to get rid of the page numbers, there's probably a better regexp you can use, to recognize just those digits that are page numbers. For example, if the page numbers are always alone on a line except for whitespace, the following command will delete just the lines containing nothing but a number surrounded by whitespace:
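A sketch of such a command, assuming GNU sed (for `\+`) and input on standard input; the function name is hypothetical:

```shell
# Hypothetical helper: drop lines that hold only a number plus optional
# surrounding whitespace; every other line passes through unchanged.
# Note: \+ ("one or more") is accepted by GNU sed in basic regular expressions.
drop_page_number_lines() {
  sed '/^[[:space:]]*[0-9]\+[[:space:]]*$/d'
}
```

For example, `drop_page_number_lines < old.txt > new.txt` (placeholder file names) would keep a line such as "Chapter 2 begins" but remove a line containing only "12".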
(\+ is a GNU extension; with some sed implementations, you may need the longer standard alternative \{1,\}, or use [0-9][0-9]* instead.)

You don't need to use the command line for this, though. Any halfway decent editor has regexp search and replacement capabilities.