Oracle – NUMBER vs BINARY_FLOAT for Allocated Space

disk-spaceoracletablespaces

I have a 10 billion records table so I need to improve space allocation. My table has only fields of type floating point (single-precision). What is the best choice, NUMBER or BINARY_FLOAT or something else ?

Best Answer

It depends on your data (and this assumes that you don't care about the different behaviors of the two data types). You can put together a test with sample data that is similar to what you're actually storing to find out what approach will use less space.

I'll create a table num_test with two different columns, one declared as a number and one as a binary_float.

create table num_test (
  num_col number,
  flt_col binary_float
);

If I populate it with 100,000 rows of data in increments of 0.001

begin
  for i in 1 .. 100000
  loop
    insert into num_test( num_col, flt_col )
      values( i/1000, i/1000 );
  end loop;
end;

then I can see the minimum, maximum, and average size required to store the values

select min( vsize(num_col) ), max( vsize(num_col) ), avg( vsize(num_col) ),
       min( vsize(flt_col) ), max( vsize(flt_col) ), avg( vsize(flt_col) )
  from num_test;

If you run that, you'll see that the number column is averaging 3.89 bytes/ row (with a min of 2 and a max of 4) while every binary_float required 4 bytes of storage. So there is a small savings to using number in this case.

But is that the whole story? Nope. If we use different sample data, we get very different results. Let's run the same test but divide everything by 3

begin
  for i in 1 .. 100000
  loop
    insert into num_test( num_col, flt_col )
      values( i/1000/3, i/1000/3 );
  end loop;
end;

Now, if you run the same vsize query, you'll see that the number column takes up to 21 bytes of storage and averages 7.69 bytes for the 200,000 rows. The flt_col is still using a fixed 4 bytes for every row.

Depending on your data, you may find that one or the other approach uses more space. Of course, where there are substantial differences in space utilization, there is the potential for more substantial rounding issues when you move to floating point numbers.

Note as well that the results may be version-dependent. I did my test on an 11.2 database where the documentation indicates that binary_float always takes 4 bytes of storage. In 10.2, the documentation indicates that binary_float took 5 bytes due to the inclusion of an extra length indicator. It's not obvious to me why a 32-bit value would ever need a length byte so I'm hard-pressed to understand why an earlier version of Oracle might have required the fifth byte, but the documentation is consistent that it did.