I've been working on a system where users can register as an user, create a book club and invite other people (members) to join. The user and member can all add books to the club, and also vote for books that other members have added. I recently tried to add a lot of data to check if the database is performing well, after which I found out that it takes a lot of time to actually get the data I like. I would like to get all books in a club, including their votes and the name of the members that have voted for it.
My database diagram (created via dbdiagram.io, check it out)
In order to freely query the database without too much hassle, I decided to use Hasura, an open source service that can create a GraphQL backend by just looking at the data structure (I am using PostgresQL). I Use the following query to get the data I want:
query GetBooksOfClubIncludingVotesAndMemberName {
books(
where: {
club_id: {_eq: "3"},
state:{_eq: 0 }
},
order_by: [
{ fallback : asc },
{ id: asc }
]
) {
id
isbn
state
votes {
member {
id
name
}
}
}
}
This query of course gets converted to a SQL statement
SELECT
coalesce(
json_agg(
"root"
ORDER BY
"root.pg.fallback" ASC NULLS LAST,
"root.pg.id" ASC NULLS LAST
),
'[]'
) AS "root"
FROM
(
SELECT
row_to_json(
(
SELECT
"_8_e"
FROM
(
SELECT
"_0_root.base"."id" AS "id",
"_0_root.base"."isbn" AS "isbn",
"_7_root.ar.root.votes"."votes" AS "votes"
) AS "_8_e"
)
) AS "root",
"_0_root.base"."id" AS "root.pg.id",
"_0_root.base"."fallback" AS "root.pg.fallback"
FROM
(
SELECT
*
FROM
"public"."books"
WHERE
(
(("public"."books"."club_id") = (('3') :: bigint))
AND (("public"."books"."state") = (('0') :: smallint))
)
) AS "_0_root.base"
LEFT OUTER JOIN LATERAL (
SELECT
coalesce(json_agg("votes"), '[]') AS "votes"
FROM
(
SELECT
row_to_json(
(
SELECT
"_5_e"
FROM
(
SELECT
"_4_root.ar.root.votes.or.member"."member" AS "member"
) AS "_5_e"
)
) AS "votes"
FROM
(
SELECT
*
FROM
"public"."votes"
WHERE
(("_0_root.base"."id") = ("book_id"))
) AS "_1_root.ar.root.votes.base"
LEFT OUTER JOIN LATERAL (
SELECT
row_to_json(
(
SELECT
"_3_e"
FROM
(
SELECT
"_2_root.ar.root.votes.or.member.base"."id" AS "id",
"_2_root.ar.root.votes.or.member.base"."name" AS "name"
) AS "_3_e"
)
) AS "member"
FROM
(
SELECT
*
FROM
"public"."members"
WHERE
(
("_1_root.ar.root.votes.base"."member_id") = ("id")
)
) AS "_2_root.ar.root.votes.or.member.base"
) AS "_4_root.ar.root.votes.or.member" ON ('true')
) AS "_6_root.ar.root.votes"
) AS "_7_root.ar.root.votes" ON ('true')
ORDER BY
"root.pg.fallback" ASC NULLS LAST,
"root.pg.id" ASC NULLS LAST
) AS "_9_root";
When executing this statement using EXPLAIN ANALYZE
in front of it, it tells me that it took approximately 9217 milliseconds to finish, check the analyze response below
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=12057321.11..12057321.15 rows=1 width=32) (actual time=9151.967..9151.967 rows=1 loops=1)
-> Sort (cost=12057312.92..12057313.38 rows=182 width=37) (actual time=9151.856..9151.865 rows=180 loops=1)
Sort Key: books.fallback, books.id
Sort Method: quicksort Memory: 72kB
-> Nested Loop Left Join (cost=66041.02..12057306.09 rows=182 width=37) (actual time=301.721..9151.490 rows=180 loops=1)
-> Index Scan using book_club on books (cost=0.43..37888.11 rows=182 width=42) (actual time=249.506..304.469 rows=180 loops=1)
Index Cond: (club_id = '3'::bigint)
Filter: (state = '0'::smallint)
-> Aggregate (cost=66040.60..66040.64 rows=1 width=32) (actual time=49.134..49.134 rows=1 loops=180)
-> Nested Loop Left Join (cost=0.72..66040.46 rows=3 width=32) (actual time=0.037..49.124 rows=3 loops=180)
-> Index Only Scan using member_book on votes (cost=0.43..66021.32 rows=3 width=8) (actual time=0.024..49.104 rows=3 loops=180)
Index Cond: (book_id = books.id)
Heap Fetches: 540
-> Index Scan using members_pkey on members (cost=0.29..6.38 rows=1 width=36) (actual time=0.005..0.005 rows=1 loops=540)
Index Cond: (id = votes.member_id)
SubPlan 2
-> Result (cost=0.00..0.04 rows=1 width=32) (actual time=0.000..0.000 rows=1 loops=540)
SubPlan 3
-> Result (cost=0.00..0.04 rows=1 width=32) (actual time=0.000..0.000 rows=1 loops=540)
SubPlan 1
-> Result (cost=0.00..0.04 rows=1 width=32) (actual time=0.001..0.002 rows=1 loops=180)
Planning Time: 0.788 ms
JIT:
Functions: 32
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 4.614 ms, Inlining 52.818 ms, Optimization 113.442 ms, Emission 81.939 ms, Total 252.813 ms
Execution Time: 9217.899 ms
(27 rows)
With table sizes of:
relname | rowcount
--------------+----------
books | 1153800
members | 19230
votes | 3461400
clubs | 6410
users | 3
This takes way too long. In my previous design, I did not have any indexes, which caused it to be even slower. I have added indexes, but I'm still not quite happy about the fact that I'll have to wait that long. Is there anything that I can improve concerning the data structure or anything?
EDIT
Same select statement, but now using EXPLAIN (ANALYZE, BUFFERS)
as suggested:
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=12057321.11..12057321.15 rows=1 width=32) (actual time=8896.202..8896.202 rows=1 loops=1)
Buffers: shared hit=2392279 read=9470
-> Sort (cost=12057312.92..12057313.38 rows=182 width=37) (actual time=8896.097..8896.106 rows=180 loops=1)
Sort Key: books.fallback, books.id
Sort Method: quicksort Memory: 72kB
Buffers: shared hit=2392279 read=9470
-> Nested Loop Left Join (cost=66041.02..12057306.09 rows=182 width=37) (actual time=222.978..8895.801 rows=180 loops=1)
Buffers: shared hit=2392279 read=9470
-> Index Scan using book_club on books (cost=0.43..37888.11 rows=182 width=42) (actual time=174.471..214.000 rows=180 loops=1)
Index Cond: (club_id = '3'::bigint)
Filter: (state = '0'::smallint)
Buffers: shared hit=113 read=9470
-> Aggregate (cost=66040.60..66040.64 rows=1 width=32) (actual time=48.211..48.211 rows=1 loops=180)
Buffers: shared hit=2392166
-> Nested Loop Left Join (cost=0.72..66040.46 rows=3 width=32) (actual time=0.028..48.202 rows=3 loops=180)
Buffers: shared hit=2392166
-> Index Only Scan using member_book on votes (cost=0.43..66021.32 rows=3 width=8) (actual time=0.018..48.187 rows=3 loops=180)
Index Cond: (book_id = books.id)
Heap Fetches: 540
Buffers: shared hit=2390546
-> Index Scan using members_pkey on members (cost=0.29..6.38 rows=1 width=36) (actual time=0.004..0.004 rows=1 loops=540)
Index Cond: (id = votes.member_id)
Buffers: shared hit=1620
SubPlan 2
-> Result (cost=0.00..0.04 rows=1 width=32) (actual time=0.000..0.000 rows=1 loops=540)
SubPlan 3
-> Result (cost=0.00..0.04 rows=1 width=32) (actual time=0.000..0.000 rows=1 loops=540)
SubPlan 1
-> Result (cost=0.00..0.04 rows=1 width=32) (actual time=0.008..0.008 rows=1 loops=180)
Planning Time: 0.400 ms
JIT:
Functions: 32
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 2.060 ms, Inlining 9.923 ms, Optimization 94.927 ms, Emission 68.793 ms, Total 175.702 ms
Execution Time: 8898.360 ms
(35 rows)
EDIT 2:
Using select * from pg_prepared_xacts;
and select * from pg_stat_activity;
as suggested in an answer. The first statement does not show any rows, and for the second one I did not notice any old xact_start times, and this was done after previously (yesterday) running VACUUM FULL votes
. Running VACUUM FULL votes
unfortunately does not fix the problem.
Output of statements:
booky=# select * from pg_prepared_xacts;
transaction | gid | prepared | owner | database
-------------+-----+----------+-------+----------
(0 rows)
booky=# select * from pg_stat_activity;
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
--------+---------+-----+----------+----------+------------------+-------------+-----------------+-------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------+-----------------+---------------------+--------+-------------+--------------+---------------------------------+------------------------------
| | 31 | | | | | | | 2020-04-05 08:41:47.959657+00 | | | | Activity | AutoVacuumMain | | | | | autovacuum launcher
| | 33 | 10 | postgres | | | | | 2020-04-05 08:41:47.959964+00 | | | | Activity | LogicalLauncherMain | | | | | logical replication launcher
169575 | booky | 48 | 10 | postgres | psql | | | -1 | 2020-04-05 10:05:20.847798+00 | 2020-04-05 10:07:47.534381+00 | 2020-04-05 10:07:47.534381+00 | 2020-04-05 10:07:47.534382+00 | | | active | | 15265333 | select * from pg_stat_activity; | client backend
| | 29 | | | | | | | 2020-04-05 08:41:47.959346+00 | | | | Activity | BgWriterHibernate | | | | | background writer
| | 28 | | | | | | | 2020-04-05 08:41:47.959688+00 | | | | Activity | CheckpointerMain | | | | | checkpointer
| | 30 | | | | | | | 2020-04-05 08:41:47.959501+00 | | | | Activity | WalWriterMain | | | | | walwriter
(6 rows)
Best Answer
49.104 * 180 = 8839, which is substantially all of your time. Most likely most of this time is going to IO to read random pages from the table (if you turn on
track_io_timings
and then doEXPLAIN (ANALYZE, BUFFERS)
we would have a definitive answer to that).If you vacuum "votes", and so get rid of the heap fetches, it would almost surely solve the problem.
If this was done after the VACUUM was done, then you probably have some kind of long-running transaction being held open, which is preventing VACUUM from doing its job effectively. Also, hitting 2,390,546 buffers to get 540 row seems incredibly outlandish. Again, that could be due to some long-open transaction causing massive bloat in your index and/or table.
Does
select * from pg_prepared_xacts;
show any rows? Doesselect * from pg_stat_activity
show any old times forxact_start
? If neither of those, then can you do aVACUUM FULL votes
and see if that fixes the problem?