Is it normal for a forced join order to make the query estimates completely inaccurate (and thus query times unpredictable)?
The use of FORCE ORDER isn't making the estimates inaccurate; the deletion of rows did. Forcing an update of statistics on the table may improve the estimation accuracy.
Should I just expect that I'll have to either accept sub-optimal query
performance, or watch it like a hawk and frequently manually edit
query hints? Or maybe hint every join as well? 0.3s to 2s is a big hit
to take.
It would be preferable to ensure the optimiser is given the information it needs to generate the best plan without using the FORCE ORDER hint. That way, it should cope better with changes to the underlying data distribution without requiring manual intervention. That said, if the nature of the data is such that cardinality can vary significantly hour by hour or day by day, consider using a plan guide to ensure the plan is fixed.
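If you go that route, a plan guide sketch might look like this (the tables and statement text are hypothetical; @stmt must match the application's statement exactly):

EXEC sp_create_plan_guide
    @name   = N'pg_force_order_example',
    @stmt   = N'SELECT o.OrderId, i.ItemName
FROM dbo.Orders AS o
JOIN dbo.Items AS i ON i.OrderId = o.OrderId;',
    @type   = N'SQL',
    @module_or_batch = NULL,
    @params = NULL,
    @hints  = N'OPTION (FORCE ORDER)';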
Is it obvious why the optimizer blew up after deleting rows? For
example, "yes, it took a sample scan, and because I archived most of
the rows earlier in the data history the sample yielded sparse
results, so it underestimated the need for a sorted hash operation"?
You didn't mention the row counts in the problem tables, but it's likely that the deletions either:
- didn't remove enough rows to trigger a statistics update. This should occur once 20% of the rows have been modified, though trace flag 2371 can be used to enable a dynamic (lower) threshold for large tables.
- did trigger a statistics update, but the sample gathered wasn't representative. Correct this by running a manual update WITH FULLSCAN (see the sketch after this list).
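A minimal sketch, assuming a hypothetical dbo.Orders table and statistics name:

UPDATE STATISTICS dbo.Orders WITH FULLSCAN;  -- all statistics on the table
UPDATE STATISTICS dbo.Orders IX_Orders_CustomerId WITH FULLSCAN;  -- a single statistics object
DBCC TRACEON (2371, -1);  -- dynamic threshold, instance-wide; default behaviour from SQL Server 2016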
You could also be running into good old-fashioned parameter sniffing problems, for which there are myriad workarounds. WITH RECOMPILE might be an expensive option to specify for a query this large, but it's worth investigating at both the procedure and statement level.
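For illustration, statement-level recompilation in a hypothetical procedure (dbo.GetOrders and its table are made-up names):

CREATE OR ALTER PROCEDURE dbo.GetOrders @CustomerId int
AS
BEGIN
    SELECT OrderId, OrderDate
    FROM   dbo.Orders
    WHERE  CustomerId = @CustomerId
    OPTION (RECOMPILE);  -- compiled for the actual parameter value on each execution
END;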
Query
Your query is forced to scan the whole table (or the whole index). Every row could be another distinct unit. The only way to substantially shorten the process would be a separate table with all available units - which would help as long as there are substantially fewer units than entries in all_units.
Since you have ~ 11k units (added in comment) for 25M entries, this should definitely help.
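If no such table exists yet, a minimal sketch to create and fill it (in practice you'd keep it current from the application or with a trigger):

CREATE TABLE unit AS
SELECT DISTINCT unit_id
FROM   all_units;

ALTER TABLE unit ADD PRIMARY KEY (unit_id);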
Depending on frequencies of values, there are a couple of query techniques to get your result considerably faster:
- recursive CTE
- JOIN LATERAL
- correlated subquery
Details in this related answer on SO.
Needing only the implicit index of the primary key on (unit_id, unit_timestamp), this query should do the trick, using an implicit JOIN LATERAL:
SELECT u.unit_id, a.max_ts
FROM   unit u
     , LATERAL (
        SELECT unit_timestamp AS max_ts
        FROM   all_units
        WHERE  unit_id = u.unit_id
        ORDER  BY unit_timestamp DESC
        LIMIT  1
       ) a;
This excludes units without an entry in all_units, like your original query.
Or a lowly correlated subquery (probably even faster):
SELECT u.unit_id
     , (SELECT unit_timestamp
        FROM   all_units
        WHERE  unit_id = u.unit_id
        ORDER  BY unit_timestamp DESC
        LIMIT  1) AS max_ts
FROM   unit u;
This includes units without an entry in all_units (max_ts is NULL for those).
Efficiency depends on the number of entries per unit: the more entries per unit, the bigger the potential gain from one of these queries.
In a quick local test with similar tables (500 "units", 1M rows in big table), the query with correlated subqueries was ~ 500x faster than your original. Index-only scans on the PK index of the big table vs. sequential scan in your original query.
Since your table keeps growing rapidly, a materialized view is probably not an option.
There is also DISTINCT ON as another possible query technique, but it's hardly going to be faster than your original query, so not the answer you are looking for.
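For reference, a sketch of that variant against the same table:

SELECT DISTINCT ON (unit_id)
       unit_id, unit_timestamp AS max_ts
FROM   all_units
ORDER  BY unit_id, unit_timestamp DESC;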
Index
Your partial_idx:
CREATE INDEX partial_idx ON all_units (unit_id, unit_timestamp DESC);
is not in fact a partial index, and it's also redundant. Postgres can scan indexes backwards at practically the same speed, so the PK index serves just as well. Drop this additional index.
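To verify, drop the index and check the plan; it should show a backward scan of the PK index rather than a sort (42 is an arbitrary sample unit_id):

DROP INDEX partial_idx;

EXPLAIN
SELECT unit_timestamp
FROM   all_units
WHERE  unit_id = 42
ORDER  BY unit_timestamp DESC
LIMIT  1;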
Table layout
A couple of points for your table definition.
CREATE TABLE all_units (
   unit_timestamp timestamp,
   unit_id        int4,
   lon            float4,
   lat            float4,
   speed          float4,
   status         varchar(255),  -- might be improved
   PRIMARY KEY (unit_id, unit_timestamp)
);
timestamp(6) doesn't make much sense; it's effectively the same as plain timestamp, which already stores a maximum of six fractional digits.
I switched the positions of the first two columns to save 4 bytes of padding, which amounts to ~ 100 MB for 25M rows (the exact result depends on status). Smaller tables are typically faster for everything.
If status isn't free text but some kind of standardized note, you could replace it with something a lot cheaper. More about varchar(255) in Postgres.
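One possible cheaper representation, sketched here with made-up status labels (an enum only works if the set of values is small and fixed):

CREATE TYPE unit_status AS ENUM ('active', 'idle', 'offline');  -- hypothetical labels

ALTER TABLE all_units
   ALTER COLUMN status TYPE unit_status
   USING status::text::unit_status;  -- fails if any existing value isn't a valid label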
Server configuration
You need to configure your server. Most of your settings seem to be conservative defaults. 1 MB on shared_buffers
or work_mem
seems way to low for an installation with millions of rows. And random_pare_cost = 4
is to high for any modern system with plenty of RAM. Start with the manual and the Postgres Wiki:
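For orientation only, a sketch using ALTER SYSTEM, assuming a dedicated machine with around 16 GB of RAM; the right values depend entirely on your hardware and workload:

ALTER SYSTEM SET shared_buffers = '4GB';     -- ~25% of RAM is a common starting point; needs a restart
ALTER SYSTEM SET work_mem = '64MB';          -- per sort/hash node, so mind concurrent queries
ALTER SYSTEM SET random_page_cost = 1.1;     -- for SSDs or a mostly cached data set
SELECT pg_reload_conf();                     -- applies the reloadable settings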
Best Answer
It's hard to say much given your need to keep your code and data confidential, but sometimes you can get better performance just by trying equivalent rewrites. The following query should return the same results but it's very likely to have a different query plan:
As you said in chat, this rewrite finishes in 6 seconds, which is a decent improvement over the original 20 seconds you were seeing with <>.