How to extract the numbers in the file using sed or any other tool

regular expression, sed, text processing

I have a file that has this format

[ 2014/05/01 10:48:26 | 13963 | DEBUG ] It took 11.16837501525879
seconds to complete the process

I have thousands of lines like this and would like to extract the 11.16837501525879 part. I tried:

 sed -e 's/^.* (\d+\.\d*)/\1/g' logfile.txt > out.txt

but I get:

sed: -e expression #1, char 21: invalid reference \1 on `s' command's RHS

What can I do here?

Best Answer

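sed uses Basic Regular Expressions (BRE) by default, and BREs don't know about \d. In a BRE, unescaped parentheses are also literal characters rather than a group, which is why sed complains that \1 is an invalid reference. Here are some other approaches: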

sed

sed -r 's/.* ([0-9]+\.*[0-9]*).*/\1/' logfile.txt > outfile.txt

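The -r switches GNU sed to Extended Regular Expressions, so the parentheses don't have to be escaped and + works as a quantifier. BSD/macOS sed spells this option -E, which recent GNU sed also accepts.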
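For comparison, here is a rough sketch of the same substitution written as a plain BRE, in case neither -r nor -E is available. The function name extract_bre and the sample line (the question's example joined onto one line) are only for illustration; in a BRE the group is written \( \) and [0-9]+ becomes [0-9][0-9]*.

```shell
# Sample line in the format from the question, assumed here to be a single line.
line='[ 2014/05/01 10:48:26 | 13963 | DEBUG ] It took 11.16837501525879 seconds to complete the process'

# Hypothetical helper: BRE version of the substitution.
# \( \) marks the group; [0-9][0-9]* stands in for the ERE [0-9]+.
extract_bre() {
    sed 's/.* \([0-9][0-9]*\.*[0-9]*\).*/\1/'
}

printf '%s\n' "$line" | extract_bre   # prints 11.16837501525879
```

Because the leading .* is greedy, this keeps the last space-separated number on the line, just like the -r version.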

perl

perl -pe 's/.* (\d+\.*\d*).*/$1/' logfile.txt > outfile.txt

grep

grep -Po '.* \K\d+\.*\d*' logfile.txt > outfile.txt

These all use your basic approach, which will find the last set of digits on the line that is preceded by a space (the leading .* is greedy). Since more than one set of numbers can appear on a line, a safer approach, if your input lines are always of the format you show, would be:

grep -Po 'took \K\d+\.*\d*' logfile.txt
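If your grep was built without -P (PCRE) support, the same idea of anchoring on the word "took" can probably be done with plain sed. This is only a sketch: the function name extract_took is made up for the example, and it assumes the number directly follows "took " as in the sample. The -n option together with the p flag prints only lines where the substitution matched, so lines without a number are dropped.

```shell
# Hypothetical helper: print only the number that follows "took ".
# -n suppresses default output; the trailing p prints lines where the
# substitution succeeded, so non-matching lines produce nothing.
extract_took() {
    sed -n 's/.*took \([0-9][0-9]*\.*[0-9]*\).*/\1/p' "$@"
}

# Two input lines, as displayed in the question; only the first has a match.
printf '%s\n' \
    '[ 2014/05/01 10:48:26 | 13963 | DEBUG ] It took 11.16837501525879' \
    'seconds to complete the process' | extract_took   # prints 11.16837501525879
```

With a file, that would be called as extract_took logfile.txt > outfile.txt.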

Related Solutions

How to delete multiple random lines from a text file using sed

You probably wanted to use RANDOM % 90 rather than &. That's where the zeroes come from (deleting line 1 is OK; on the next run, the lines will be numbered 1 .. 89).

There is a problem, though: The formula could generate the same number several times. To prevent that, use a different approach: shuffle the numbers and pick the first ten:

shuf -i1-90 -n10 | sed 's/$/d/' | sed -f- input > output

If you don't like sed generating a sed script, you can use printf, too:

sed -f <( printf %dd\;  $(shuf -i1-90 -n10) ) input > output
