Postgresql – Determine percentage from count() without cast issues

castcountdatatypespostgresql

I'm trying to run the following query to provide the % of rows in my patients table that have a value the refinst column. I keep getting a result of 0.

select (count (refinst) / (select count(*) from patients) * 100) as "Formula" 
from patients;

The table has 15556 rows, and select count(refinst) from patients tells me that 1446 of those have a value in the refinst column. The response I'd like to get from the query would be 30.62 (1446/15556*100=30.62XXXXX, rounded to two decimals).

I'm pretty sure it has something to do with the data type of the count results (integers I'm assuming). If I divide an integer by an integer and the result is less than 0 it is truncated to 0 correct? If that's the case, can someone show me how to cast the results of the counts as a number with 2 decimal places so that the result will be rounded to 2 decimal places as well?

I'm sure there's a better way to write this code than multiple count statements. I am looking for a more processor-efficient way to write this query in particular.

Best Answer

SELECT (count(refinst) * 100)::numeric / NULLIF(count(*), 0) AS refinst_pct
    -- count(refinst) * 100.0 / NULLIF(count(*), 0) AS refinst_pct  -- simpler
FROM   patients;
  • Do not use a subselect. Both aggregates can be derived from the same query. Cheaper.

  • Also, this is not a case for window functions, since you want to compute a single result, and not one result per row.

  • Cast to any numeric type that supports fractional digits, like @a_horse already explained.
    Since you want to round() to two fractional digits I suggest numeric (which is the same as decimal in Postgres).
    It's enough to cast one value involved in a calculation, preferably the first. Postgres automatically settles for the type that does not lose information.
    Or, simpler yet: since we multiply anyway, use a numeric constant that's coerced to numeric automatically because of the decimal point (100.0).

  • It's generally a good idea to multiply before you divide. This typically minimizes rounding errors and is cheaper.
    In this case, the first multiplication (count(refinst) * 100) can be computed with cheap and exact integer arithmetic. Only then we cast to numeric and divide by the next integer (which we do not cast additionally).

  • NULLIF(count(*), 0) prevents division by zero (raising an exception). We get NULL as (unknown) percentage if there are no rows at all.

Rounded to two fractional digits:

SELECT round((count(refinst) * 100)::numeric / NULLIF(count(*), 0), 2) AS refinst_pct
FROM   patients;