Text Processing – Tool to Convert Byte-Count to Human-Readable Units


Is there a standard tool that converts an integer byte count into a human-readable value in the largest possible unit, while keeping the numeric value between 1.00 and 1023.99?

I have my own bash/awk script, but I am looking for a standard tool that is found on many or most distros: something more generally available that ideally takes simple command-line arguments and/or accepts piped input.

Here are some examples of the type of output I am looking for.

    1    Byt  
  173.00 KiB  
   46.57 MiB  
    1.84 GiB  
   29.23 GiB  
  265.72 GiB  
    1.63 TiB  

Here is the bytes-human script (used for the above output)

awk -v pfix="$1" -v sfix="$2" 'BEGIN { 
      uct = split( "Byt KiB MiB GiB TiB PiB", unit )
      # val[i] is one less than the size of unit i in bytes
      for( i=1; i<=uct; i++ ) val[i] = (2^(10*(i-1)))-1
   }{ uix = uct   # start from the largest unit again for every input line
      if( int($1) == 0 ) uix = 1; else while( uix > 1 && $1 < val[uix]+1 ) uix--
      num = $1 / (val[uix]+1)
      if( uix==1 ) n = "%5d   "; else n = "%8.2f"
      printf( "%s"n" %s%s\n", pfix, num, unit[uix], sfix ) 
   }'

Update: Here is a modified version of Gilles' script, as described in a comment on his answer (modified to suit my preferred look).

awk 'function human(x) {
         # each unit label is exactly 4 characters wide, in ascending order
         s=" B   KiB MiB GiB TiB PiB EiB ZiB YiB"
         while (x>=1024 && length(s)>4) 
               {x/=1024; s=substr(s,5)}
         s=substr(s,1,4)
         xf=(s==" B  ")?"%5d   ":"%8.2f"
         return sprintf( xf"%s", x, s)
      }
      {gsub(/^[0-9]+/, human($1)); print}'

Best Answer

No, there is no such standard tool.

Since GNU coreutils 8.21 (Feb 2013, so not yet present in all distributions), on non-embedded Linux and Cygwin, you can use numfmt. It doesn't produce exactly the same output format (as of coreutils 8.23, I don't think you can get 2 digits after the decimal point).

$ numfmt --to=iec-i --suffix=B --padding=7 1 177152 48832200 1975684956
     1B
 173KiB
  47MiB
 1.9GiB
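
numfmt also reads numbers from standard input when none are given on the command line, which covers the piped-input case from the question. A minimal sketch, assuming GNU coreutils 8.21 or later is installed; the wrapper name to_iec is hypothetical and only there for illustration:

```shell
# to_iec: hypothetical helper name; reads one byte count per line on stdin
# and prints it in IEC units, right-aligned in a 7-character column.
to_iec() {
    numfmt --to=iec-i --suffix=B --padding=7
}

printf '%s\n' 1 177152 48832200 1975684956 | to_iec
```

This should print the same four lines as the command-line form above.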

Many older GNU tools (for example ls, du and df with the -h option) can produce this format, and GNU sort can sort numbers with units (sort -h) since coreutils 7.5 (Aug 2009, so present on modern non-embedded Linux distributions).
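
As a rough illustration of the sorting side, here is a sketch assuming GNU sort from coreutils 7.5 or later and sizes written the way du -h or ls -h print them (173K, 47M, 1.9G). The helper name sort_sizes is hypothetical, and LC_ALL=C is set only so that "." is treated as the decimal point regardless of locale:

```shell
# sort_sizes: hypothetical helper; orders lines by a leading
# human-readable size using GNU sort -h (--human-numeric-sort).
sort_sizes() {
    LC_ALL=C sort -h
}

printf '%s\n' 1.9G 173K 47M | sort_sizes
```

The lines should come out smallest first: 173K, then 47M, then 1.9G.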


I find your code a bit convoluted. Here's a cleaner awk version (the output format isn't exactly identical):

awk '
    function human(x) {
        if (x<1000) {return x} else {x/=1024}
        s="kMGTPEZY";   # prefixes in ascending order: kilo, mega, giga, tera, peta, exa, zetta, yotta
        while (x>=1000 && length(s)>1)
            {x/=1024; s=substr(s,2)}
        return int(x+0.5) substr(s,1,1)
    }
    {sub(/^[0-9]+/, human($1)); print}'
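
Because the script only rewrites the number at the start of each line, it can be used as a filter on output that has other columns after the byte count. A small sketch of that usage, with the prefixes in ascending order (k, M, G, T, P, E, Z, Y); the function name human_bytes and the file names are made up for the example:

```shell
# human_bytes: hypothetical wrapper name around the awk function above
human_bytes() {
    awk '
        function human(x) {
            if (x<1000) {return x} else {x/=1024}
            s="kMGTPEZY"
            while (x>=1000 && length(s)>1)
                {x/=1024; s=substr(s,2)}
            return int(x+0.5) substr(s,1,1)
        }
        {sub(/^[0-9]+/, human($1)); print}'
}

# Byte counts followed by a tab and a file name, similar to GNU du -b output
printf '%s\t%s\n' 177152 notes.txt 48832200 video.mkv | human_bytes
```

The first column becomes 173k and 47M, and the file names pass through unchanged.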

(Reposted from a more specialized question)