I'm trying to automatically calculate a table for user retention, so that the end result is something like:
Week | %
0 | 100
1 | 50
2 | 35
For now I'm not bothering with the % and just trying to get a count of users, i.e. users that came back on week 1, week 2, etc.
I have an accounts
table and an action
table. Currently I don't have indexes in either of these. Both have a user_email
that stores the associated user, and both have a timestamp
field that's a TIMESTAMP WITH TIME ZONE
– for accounts
this is the account creation date and for action
it's the timestamp of the action.
The two tables are currently super simple:
accounts Table:
timestamp | timestamp with time zone
user_email |character varying
action Table:
timestamp | timestamp with time zone
user_email |character varying
That's it, no keys or anything else.
I managed to piece together a query that seems to do this calculation:
WITH weeks AS (
SELECT (n || 'week')::interval as week from generate_series(0, 3) AS n
)
SELECT
week,
(SELECT count(*) FROM accounts WHERE user_email IN
(SELECT user_email FROM action
WHERE age(action.timestamp, accounts.timestamp) >= week
AND age(action.timestamp, accounts.timestamp) < week + '1 week'::interval))
AS "Returning Users"
FROM weeks;
The output here is something like:
week | Returning Users
0 days | 100
7 days | 50
14 days | 35
21 days | 25
The thing is, this takes super long to calculate (with 1300 users and about 100000 actions it takes a few minutes). I'm also convinced there's a more efficient way to calculate this, or that I can at least add indexes somewhere.
Unfortunately my SQL knowledge is quite limited and after looking around for a while I'm still not sure where to start. I think I can add indexes to the timestamp
columns? And I'm not sure my way of checking if timestamps fall in certain ranges is correct.
Best Answer
It's a bit hard to tell how your tables are linked together, but I think the following should be more efficient: