Text Processing – Tool to Convert Byte-Count to Human-Readable Units


Is there a standard tool which converts an integer count of bytes into a human-readable count of the largest possible unit size, while keeping the numeric value between 1.00 and 1023.99?

I have my own bash/awk script, but I am looking for a standard tool that is found on most distros: something more generally available, which ideally has simple command-line arguments and/or can accept piped input.

Here are some examples of the type of output I am looking for.

    1    Byt  
  173.00 KiB  
   46.57 MiB  
    1.84 GiB  
   29.23 GiB  
  265.72 GiB  
    1.63 TiB  

Here is the bytes-human script (used for the above output)

awk -v pfix="$1" -v sfix="$2" 'BEGIN { 
      split( "Byt KiB MiB GiB TiB PiB", unit )
      uix = uct = length( unit )
      for( i=1; i<=uct; i++ ) val[i] = (2**(10*(i-1)))-1
   }{ uix = uct   # restart from the largest unit for every input line
      if( int($1) == 0 ) uix = 1; else while( $1 < val[uix]+1 ) uix--
      num = $1 / (val[uix]+1)
      if( uix==1 ) n = "%5d   "; else n = "%8.2f"
      printf( "%s"n" %s%s\n", pfix, num, unit[uix], sfix ) 
   }'

Update: Here is a modified version of Gilles' script, as described in a comment on his answer (modified to suit my preferred look).

awk 'function human(x) {
         s=" B   KiB MiB GiB TiB PiB EiB ZiB YiB"
         while (x>=1024 && length(s)>4) 
               {x/=1024; s=substr(s,5)}
         s=substr(s,1,4)   # keep only the current 4-character unit
         xf=(s==" B  ")?"%5d   ":"%8.2f"
         return sprintf( xf"%s", x, s)
      }
      {gsub(/^[0-9]+/, human($1)); print}'

Best Answer

No, there is no such standard tool.

Since GNU coreutils 8.21 (Feb 2013, so not yet present in all distributions), on non-embedded Linux and Cygwin, you can use numfmt. It doesn't produce exactly the same output format (as of coreutils 8.23, I don't think you can get 2 digits after the decimal point).

$ numfmt --to=iec-i --suffix=B --padding=7 1 177152 48832200 1975684956
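Since the question also asks about piped input, here is a minimal sketch of the same numfmt call reading byte counts from standard input instead of from its arguments. It assumes GNU coreutils 8.21 or later is installed; to_iec is a hypothetical wrapper name used only for this illustration, not part of coreutils.

```shell
# Assumption: numfmt from GNU coreutils >= 8.21 is on the PATH.
# to_iec is a hypothetical helper name, not a standard command.
to_iec() {
    # With no numbers on the command line, numfmt converts
    # each number it reads from standard input.
    numfmt --to=iec-i --suffix=B
}

# One byte count per line, e.g. produced by another command.
printf '%s\n' 1 177152 48832200 1975684956 | to_iec
```

With this input, 177152 should come out as 173KiB. The --padding option from the command above can be added inside the function if aligned columns are wanted.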

Many older GNU tools can produce this format with their -h option (ls -lh, du -h and df -h, for example), and GNU sort can sort numbers with units (sort -h) since coreutils 7.5 (Aug 2009, so present on modern non-embedded Linux distributions).
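As an illustration of the sorting side, the sketch below feeds a few made-up lines in the style of du -h output to GNU sort with its -h option. The sizes and names are hypothetical sample data, and the example assumes GNU sort from coreutils 7.5 or later.

```shell
# Hypothetical sample data: a human-readable size, a tab, then a name.
sizes=$(printf '%s\t%s\n' 1.9G video 173K notes 47M photos 512 stub)

# -h compares numbers by their unit suffix (K, M, G, ...) as well as
# by value, so 512 sorts before 173K, and 47M before 1.9G.
printf '%s\n' "$sizes" | sort -h
```

A plain numeric sort (sort -n) would ignore the suffixes and order these lines by the leading digits alone.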

I find your code a bit convoluted. Here's a cleaner awk version (the output format isn't exactly identical):

awk '
    function human(x) {
        if (x<1000) {return x} else {x/=1024}
        # s holds the remaining unit letters; its first character is the current unit
        s="kMGTPEZY";
        while (x>=1000 && length(s)>1)
            {x/=1024; s=substr(s,2)}
        return int(x+0.5) substr(s,1,1)
    }
    {sub(/^[0-9]+/, human($1)); print}'
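For readers who want the two-decimal binary-unit style from the question, the same approach can be adapted as follows. This is only a sketch, not the script from either post: bytes_human is a hypothetical wrapper name, and the threshold of 1024 and the unit list are choices made for this illustration.

```shell
# bytes_human is a hypothetical name; it reads lines from standard input
# and rewrites a leading byte count into a human-readable value.
bytes_human() {
    awk '
    function human(x,    n, i) {
        # unit[1] is plain bytes; each later entry is 1024 times larger.
        n = split("B KiB MiB GiB TiB PiB EiB ZiB YiB", unit, " ")
        i = 1
        while (x >= 1024 && i < n) { x /= 1024; i++ }
        if (i == 1) return sprintf("%d %s", x, unit[i])
        return sprintf("%.2f %s", x, unit[i])
    }
    { sub(/^[0-9]+/, human($1)); print }'
}

printf '%s\n' 1 177152 48832200 | bytes_human
```

One caveat with this sketch: because %.2f rounds, a value just under a unit boundary (1048575 bytes, for instance) is printed as 1024.00 KiB rather than staying below 1023.99.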

(Reposted from a more specialized question)