The bad plan is probably the culmination of many problems, which means there are many ways of tackling it. My guess is that the combination of problems causes two plans to look falsely close to each other in cost, and then the difference in memory settings (probably effective_cache_size) between master and slave is the straw that breaks the camel's back and tips the master over to using the wrong one.
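A quick way to confirm that the two servers really run with different settings is to check the value in a session on each one:

-- Run on both master and slave and compare the output.
SHOW effective_cache_size;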
The simplest starting point for tackling it may be this one:
" -> Index Scan using idx_own_spas on spatial_artifact this_ (cost=0.43..23351.70 rows=6841 width=604) (actual time=0.037..2.119 rows=618 loops=1)"
" Index Cond: (propietari = 7649)"
This is simple because the scan runs to completion and has only a single index qualification, and yet the estimate is off by a factor of 10. Since your statistics are up to date, the problem must be an inadequate default_statistics_target. You can increase that globally, or increase it just for the column propietari using SET STATISTICS, and then re-analyze the table. If this estimation problem were fixed, the master would almost certainly switch to the faster plan, as it would appear even faster yet.
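For the per-column route, a minimal sketch (the target of 1000 is an illustrative value; the default is 100):

-- Raise the statistics target for just this column, then re-analyze
-- so the planner sees the finer-grained statistics.
ALTER TABLE spatial_artifact ALTER COLUMN propietari SET STATISTICS 1000;
ANALYZE spatial_artifact;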
Another problem is that it thinks rows meeting the criteria ((group_id IS NULL) AND (propietari = 7649)) are distributed randomly over the values of this_.id, which they apparently are not. That causes it to think the LIMIT clause will kick in much sooner than it actually does, making the slow plan look faster than it will be. Unfortunately, there is nothing simple you can do to improve this estimate. But if you create the right index, you can provide a third plan which actually is much faster, and also looks much faster, than either of the two current ones.
That index would be:
CREATE INDEX idx_own_spas
ON spatial_artifact
USING btree
(propietari,id)
WHERE group_id IS NULL;
This new index can replace the existing similar one that lacks the id column, so it should not take up much additional space or maintenance overhead. Note that the old index already holds the name idx_own_spas, so you need to drop it (or pick a different name) before the CREATE INDEX above can succeed; one non-blocking way to do the swap is sketched below.
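A possible replacement sequence, assuming the hypothetical interim name idx_own_spas_new (CREATE INDEX CONCURRENTLY avoids blocking writes while the index builds):

-- Build the new index alongside the old one, without blocking writes.
CREATE INDEX CONCURRENTLY idx_own_spas_new
    ON spatial_artifact (propietari, id)
    WHERE group_id IS NULL;
-- Once it is built and valid, swap it in for the old index.
DROP INDEX CONCURRENTLY idx_own_spas;
ALTER INDEX idx_own_spas_new RENAME TO idx_own_spas;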
Finally, you can change effective_cache_size within each session, so it would be easy to change this value on the master and on the slave to see if that is truly what is tipping the plan over from the good plan to the bad one. But even if it is, that still doesn't tell you what to do about it: just because a certain value causes a bad planner choice for one query doesn't mean it is set improperly for the system as a whole. So hopefully one of the first two methods will work.
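A session-level experiment could look like this ('4GB' is a placeholder; use the value from the other server's configuration):

-- Try the slave's setting on the master for this session only.
SET effective_cache_size = '4GB';
-- ... re-run EXPLAIN (ANALYZE) on the problem query here ...
RESET effective_cache_size;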
That time is spent by pgAdmin packing and rendering the data; it is not the time Postgres takes to execute the query. Why are you fetching 250,000 rows into pgAdmin? If you need to export the table to a plain-text file (like a CSV with a header) you can execute this query:
COPY (
    SELECT
        mobile_state_message.latitude,
        mobile_state_message.longitude
    FROM
        public.mobile_state_message
    WHERE
        topic_name = 'MyCar'
)
TO '/path/to/file.csv' WITH (FORMAT CSV, HEADER);
Remember that the server writes the file as the postgres user, so /path/to/ must be owned (or at least writable) by that user; run chown -R postgres /path/to/ if needed.
This way you can use whatever tool you like to analyze the table (if you need to).
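If you have no filesystem access on the database server, psql's client-side \copy meta-command is a possible alternative: it takes the same options as COPY but writes the file on the machine where psql runs, so no special ownership is needed (note that \copy must be written on a single line):

\copy (SELECT latitude, longitude FROM public.mobile_state_message WHERE topic_name = 'MyCar') TO 'file.csv' WITH (FORMAT CSV, HEADER)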
It is a very broad question, even more so as you give absolutely no details about your setup and kind of database (volume, type of queries, active connections, size of RAM, dedicated server or not, etc.).
You can start by enabling PostgreSQL to log slow queries; see the log_min_duration_statement setting in the configuration. A minimal sketch of enabling it from a superuser session (the 500 ms threshold is an illustrative value; equivalently, set it in postgresql.conf):
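-- Log every statement that runs longer than 500 ms.
ALTER SYSTEM SET log_min_duration_statement = '500ms';
SELECT pg_reload_conf();  -- reload the configuration so the setting takes effect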
That will give you historical data that you can then analyze and maybe correlate with other things (like the items in @Vérace's list). You will then have various tools to help, as described at https://wiki.postgresql.org/wiki/Logging_Difficult_Queries: