Biggest possible number for sort -n

sort

How to determine the "biggest number" so that sort -n will always put it at the end? I'm thinking of something along the lines of Inf in some languages, but I'm not sure whether anything like this exists for sort.


The background is that I'm sorting a list of potentially non-existent paths by age so that

  • existent go first, from newest to oldest,

  • non-existent go last.

I'm using the decoration approach and trying to put a penalty on those unborns:

INFINITY=99999999999999   # ...close enough, right?

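age_of() {
    # Print the age of "$1" in seconds, or $INFINITY if it does not exist.
    # "$1" is quoted so that paths containing spaces are tested correctly.
    if [ -e "$1" ]
    then
        local ctime=$(stat -c %Z "$1" 2>/dev/null)
        local now=$(date +%s)
        echo $(($now - $ctime))
    else
        echo "$INFINITY"
    fi
}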

# Decorate each path with its age, sort numerically, then strip the age again.
while IFS= read -r path
do
    echo "$(age_of "$path") $path"
done \
  | sort -n \
  | cut -d' ' -f 2-

but obviously the comment is pretty naive; it's just a matter of time before even 99999999999999 is not close enough. 😉

So is there a better value for INFINITY?

Best Answer

Not a POSIX solution, but GNU sort offers the -g option, which supports a wider range of number specifications, including infinity. From http://www.gnu.org/software/coreutils/manual/html_node/sort-invocation.html:

‘-g’
‘--general-numeric-sort’
‘--sort=general-numeric’

Sort numerically, converting a prefix of each line to a long double-precision
floating point number. See Floating point. Do not report overflow, underflow, or
conversion errors. Use the following collating sequence:

  • Lines that do not start with numbers (all considered to be equal).
  • NaNs (“Not a Number” values, in IEEE floating point arithmetic) in a consistent but machine-dependent order.
  • Minus infinity.
  • Finite numbers in ascending numeric order (with -0 and +0 equal).
  • Plus infinity.

Use this option only if there is no alternative; it is much slower than
--numeric-sort (-n) and it can lose information when converting to floating
point.

From my own tests it seems that any line beginning with Inf (any combination of upper/lower case) will appear after any numbers.
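A quick way to check this yourself is sketched below. It assumes GNU sort is the sort on your PATH; the helper name sort_by_age_g and the /tmp paths are made up for illustration, and the age_of function from the question would simply echo inf for missing paths instead of a large number.

```shell
# Hypothetical helper: sort "AGE PATH" lines with GNU sort -g, then strip the age.
# Assumes GNU coreutils; -g is not specified by POSIX.
sort_by_age_g() {
    LC_ALL=C sort -g | cut -d' ' -f2-
}

# "inf" stands in for the age of a non-existent path.
printf '%s\n' '120 /tmp/a' 'inf /tmp/missing' '5 /tmp/b' | sort_by_age_g
```

With the three sample lines above, the two finite ages come out in ascending order and the inf line comes last.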

Failing that, I don't think there are any character sequences that are reliably sorted after numbers using sort -n. GNU sort seems to treat all other sequences as zero, placing them after negative numbers but before positive ones. What you could do, if it is timestamps that are being sorted, is to use the maximum value for a signed 64-bit timestamp plus one:

 9,223,372,036,854,775,808

This is a few more digits than you started out with!
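A minimal sketch of that sentinel in use, written without the thousands separators. As far as I know, GNU sort -n compares the digit strings themselves rather than converting them to machine integers, so the extra digit is not a problem there; other sort implementations may behave differently. The helper name sort_by_age_n and the /tmp paths are made up for illustration.

```shell
# 2^63, one more than the largest signed 64-bit value. Keep it as a string:
# on shells with 64-bit arithmetic, using it inside $(( )) would overflow.
INFINITY=9223372036854775808

# Hypothetical helper: sort "AGE PATH" lines numerically, then strip the age.
sort_by_age_n() {
    LC_ALL=C sort -n | cut -d' ' -f2-
}

printf '%s\n' \
    "$INFINITY /tmp/missing" \
    '9223372036854775807 /tmp/ancient' \
    '42 /tmp/new' \
  | sort_by_age_n
```

Even an age equal to the largest signed 64-bit value sorts before the sentinel, so the non-existent path ends up last.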