Postgresql – How to improve the “algorithm” for figuring out the average spending on food per month

postgresqlquery

I have a database table keeping track of every grocery delivery. It's stored as timestamps and sums (EUR).

Currently, I do this:

SELECT (SUM("amount in EUR") / 3) 
FROM personal."food purchases" 
WHERE "purchase time" >= now() - INTERVAL '3 months';

Now, let's say that the current day is 1st of September 2020. This "algo" will count together all the deliveries from exactly three months back from the current moment, which I guess means from the first day of June, and then simply slash this price into 3 to symbolize "three months".

Sometimes, depending on the day, this will give a reasonable value. Other times, and I cannot fully describe/understand why, this gives either a too high or too low value. I suspect that the reason for this is that my food purchases/deliveries don't always happen on the exact same day, and vary quite a bit in price as well. Thus, the resulting sum, even though I do an interval of three months and then slash with three, and thus "should" give the average food cost "recently", is oftentimes misleading.

I can't tell for sure, but I think that it sometimes includes one purchase "too many", and sometimes "one too few". It might even be worse than that.

Can you help me improve my formula to be more "fair"? I want to know the average amount I pay for food each month, and I'm not convinced that I'm getting that value with my current query.

I think I need to drop the part which goes three months back from the current time, and instead somehow calculates the… last three… months… um… Frankly, I don't know what I need to do. I just don't think that the current method is correct.

I'd appreciate if your suggestion/solution stays as close as possible to my original query. I want to understand what is happening. I actually suspect that this is a very common issue and probably built into PG and all other database softwares.

Best Answer

I would first calculate the sum per month, then calculate the average from that:

select avg(amount) as avg_amount_per_month
from (
  SELECT date_trunc('month', "purchase time"), SUM("amount in EUR") as amount
  FROM personal."food purchases" 
  WHERE "purchase time" >= now() - INTERVAL '3 months'
  group by date_trunc('month', "purchase time")
) t
;

The inner query will return the amount for each month of the chose interval (yielding three rows), and the outer query will then calculate the average from that.