I'm building a messaging system, and trying to get a list of messages and their replies; and also join the participants of the message threads.
So I'm shooting for messages = [{id, subject, body, replies: [..], users: [..]}]
SELECT messages.*,
to_json(array_agg(users)) users,
to_json(array_agg(replies)) replies
FROM messages
LEFT JOIN (SELECT id, fullname, email, company FROM users) users
ON users.id = messages.user_id OR users.id = messages.to
LEFT JOIN (SELECT * FROM messages ORDER BY created_at) replies
ON replies.message_id = messages.id
WHERE messages.to = :to OR (messages.to IS NOT NULL AND messages.user_id = :to)
GROUP BY messages.id;
I've tried various other queries, this gets me the closest. The problem is that for each message, all the replies are repeated once. I.e., if there are 3 replies in a thread, I get 6 (2x each). Any ideas what might be wrong?
I am using Postgres 9.5.
Best Answer
The basic problem is this:
Here is one way to avoid this problem and improve some other details while being at it:
Use json_agg() instead of
to_json(array_agg())
. Simpler, faster.Aggregate the subquery on
users
right away in aLATERAL
join, thus avoiding the main problem. Related:This:
can be expressed simpler:
This:
could be shortened to:
But a single
ORDER BY
in a subquery is typically faster than per-aggregateORDER BY
.