How to extract the numbers in the file using sed or any other tool

regular expression, sed, text processing

I have a file in this format:

[ 2014/05/01 10:48:26 | 13963 | DEBUG ] It took 11.16837501525879
seconds to complete the process

I have thousands of lines like this and would like to extract the 11.16837501525879 part from each one.
I tried:

 sed -e 's/^.* (\d+\.\d*)/\1/g' logfile.txt > out.txt  

but I get:

sed: -e expression #1, char 21: invalid reference \1 on `s' command's RHS  

What can I do here?

Best Answer

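sed uses Basic Regular Expressions (BREs) by default. In a BRE, unescaped parentheses are literal characters, so your expression defines no capture group and \1 has nothing to refer to; that is what the error message says. sed also doesn't know about \d, so use [0-9] instead. Here are some approaches that work: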

  1. sed

    sed -r 's/.* ([0-9]+\.?[0-9]*).*/\1/' logfile.txt > outfile.txt
    

    The -r enables Extended Regular Expressions, so the parentheses and the + work without backslashes.

  2. perl

    perl -pe 's/.* (\d+\.?\d*).*/$1/' logfile.txt > outfile.txt
    
  3. grep

    grep -Po '.* \K\d+\.?\d*' logfile.txt > outfile.txt
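
If -r is not available, the same idea can be written as a plain BRE by escaping the parentheses. The following is only a sketch: it assumes the log entry sits on a single line, uses a made-up sample line in the format from the question, and (unlike the commands above) requires a decimal point in the number.

```shell
# Sample line in the question's format (assumed to be one line in the real log).
line='[ 2014/05/01 10:48:26 | 13963 | DEBUG ] It took 11.16837501525879 seconds to complete the process'

# BRE: groups are written \( \), and [0-9] stands in for the unsupported \d.
# -n together with the p flag prints only lines where the substitution matched.
result=$(printf '%s\n' "$line" | sed -n 's/.* \([0-9][0-9]*\.[0-9]*\).*/\1/p')
echo "$result"
```

Because of -n and p, lines that contain no such number are dropped instead of being copied through unchanged.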
    

These all use your basic approach, which will find a set of digits in the line that is preceded by a space. Because the leading .* is greedy, that is the last such number on the line. If more than one number can appear on a line and your input lines are always of the format you show, a safer approach would be:

    grep -Po 'took \K\d+\.?\d*' logfile.txt
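
Note that -P is a GNU grep option and may not be available on other systems. As a possible alternative, here is a minimal awk sketch of the same "number after took" idea. It assumes the fields are separated by whitespace and that the number is the field immediately following the word took; the sample line is made up to match the format in the question.

```shell
# Sample line in the question's format (assumed to be one line in the real log).
line='[ 2014/05/01 10:48:26 | 13963 | DEBUG ] It took 11.16837501525879 seconds to complete the process'

# Walk the whitespace-separated fields and print the one right after "took".
result=$(printf '%s\n' "$line" | awk '{ for (i = 1; i < NF; i++) if ($i == "took") { print $(i + 1); break } }')
echo "$result"
```

To run it on the real file, replace the printf pipeline with the file name as awk's argument, for example `awk '...' logfile.txt > outfile.txt`.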