PostgreSQL – Sum of Double Precision Gives Incorrect Results

postgresqlsum

I've been making a sum of a column containing about 500.000 records of double precision numbers. All numbers in the database should normally be of two ciphers behind the period. However when making the sum I get 6 numbers after the period: 123123123.549977

Either I have bad data in my database where there are records with more numbers after the period or there is something I'm missing about the sum function.

So my questions are:

  • Does the sum function have any rounding properties which may cause this?
  • Is there a way to select all records containing more than two numbers after the period?

Best Answer

This is a common problem with floating point numbers everywhere.

Floating point numbers stored in computer systems should only ever be considered approximations because there are numbers easy to represent in decimal that come out longer than the available precision (sometimes they are in fact never ending) when converted to binary. See ypercube's links and https://stackoverflow.com/questions/588004/is-floating-point-math-broken amongst many other references, for more detail.

The most common example given (as seen in that StackOverflow link) is 0.1+0.2 not resulting in exactly 0.3. You have to apply extra rounding or flexible bounds checking (in both cases reducing the effective precision) to get the behaviour you are expecting.

As your data has a fixed number of decimal places (or a fixed maximum) at 2, you would be much better of using the fixed place decimal/numeric types or similar. These are actually stored and processed as scaled integers avoiding the need for any floating point representation internally so avoiding the approximation problem (all integers can be represented accurately in both base 10 and base 2, assuming you have enough digits/bits). For instance 0.1 and 0.2 might be stored as 1000 and 2000, so the addition results in 3000 which when converted to a string for display becomes 0.3 not the 0.3000000004 you might get out of a floating point calculation, and of course compares accurately to 3 in the same type which would be scaled to 3000.

Is there a way to select all records containing more than two numbers after the period?

You could convert to string and count the characters after the decimal point (or comma depending on your system local). In TSQL this would find values that are have ended up with more than two decimal places after being stored in a binary float format then translated back to decimal:

SELECT *
FROM   myTable
WHERE  CHARINDEX('.', REVERSE(CAST(floatValue AS NVARCHAR(MAX)))>3

(you may need to tweak function names and related syntax for postgres).