Print only lines that are completely numeric

awk, numeric data, text processing

I'd like to filter through a text file and only print the lines where each column is a valid floating point number. For example:

3 6 2 -4.2 21.2 
3 x 4.2 21.2 
3 2 2.2.2

Only the first line would pass, as neither x nor 2.2.2 is a valid float. I can write a Python script that simply calls .split() and runs a try/except block over each part, but this is slow for larger files. The input file has an unknown, variable number of columns, and no scientific notation will be used. Is there an awk solution?

Best Answer

awk '
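    # quick reject: skip any line containing a character other than
    # a digit, dot, space, or minus sign (this also rejects tabs and "+")
    /[^0-9. -]/ {next}
    {
        # a field is a number only if adding 0 leaves it unchanged;
        # non-numeric fields such as 2.2.2 are compared as strings and differ
        for (i=1; i<=NF; i++)
            if ($i + 0 != $i)
                next
        # every field passed, so print the whole line
        print
    }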
'

This will reject valid numbers written in scientific notation, such as 1.2e1 (which equals 12), because the first rule skips any line containing a letter.
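
For comparison, the same filter can be written by matching each field against a regular expression instead of relying on awk's number/string comparison rules. This is only a sketch, not part of the original answer: the function name numeric_lines is hypothetical, the pattern assumes plain decimals with an optional sign and no exponent, and dropping blank lines is an assumption too.

```shell
# numeric_lines: hypothetical helper name (not from the original answer).
# Reads stdin and prints only the lines in which every whitespace-separated
# field is an optionally signed decimal number without an exponent.
numeric_lines() {
    awk '
        # assumption: lines with no fields at all are dropped
        NF == 0 { next }
        {
            # reject the line as soon as one field fails the pattern
            for (i = 1; i <= NF; i++)
                if ($i !~ /^[-+]?([0-9]+\.?[0-9]*|\.[0-9]+)$/)
                    next
            print
        }
    '
}
```

It could be used as numeric_lines < data.txt, where data.txt stands in for the input file. Because fields are split on any whitespace, this variant also accepts tab-separated columns.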
