Unfortunately, MySQL is quite notorious for subquery issues, particularly with non-SELECT queries. I once addressed this back on Feb 22, 2011: Problem with MySQL subquery.
In that old post I found documentation on how MySQL handles subqueries, and I have tried to keep up with it since; I last edited that post on Oct 20, 2012 to reflect the latest documentation.
In brief, subquery optimization may make rows seem to disappear intermittently while the optimizer works out the EXPLAIN plan. You can read the documentation and learn about it if you can wrap your head around it (I'm not going through that again).
Now for your question...
The most sensible approach (no JOINs, no sub-SELECTs) is the following:
DELETE FROM `TABLE` WHERE value_was IS NULL OR value_was <= value_now;
Will you look at that: it's the query from your question. So why am I suggesting your original idea?
Because it is a full table scan IN ONE PASS. Any other approach can potentially double the work (or triple it, if you try to get indexes involved this late in the game). Running it this way also delays the need to defragment the table.
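For contrast, here is the kind of subquery formulation to avoid. MySQL refuses to delete from a table it is simultaneously selecting from (ERROR 1093), and the usual workarounds (wrapping the subquery in a derived table) only add extra passes. The id column below is a placeholder for your primary key, not something from the question:

```sql
-- Anti-pattern: fails with ERROR 1093, because `TABLE` is both the
-- DELETE target and the source of the subquery.
DELETE FROM `TABLE`
WHERE id IN (SELECT id FROM `TABLE`
             WHERE value_was IS NULL OR value_was <= value_now);
```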
If you want to delete and defragment, here are two options. (ALTER TABLE ... ENGINE=<current engine> rebuilds the table in place, which is what removes the fragmentation.)
OPTION #1
MyISAM
DELETE FROM `TABLE` WHERE value_was IS NULL OR value_was <= value_now;
ALTER TABLE `TABLE` ENGINE=MyISAM;
InnoDB
DELETE FROM `TABLE` WHERE value_was IS NULL OR value_was <= value_now;
ALTER TABLE `TABLE` ENGINE=InnoDB;
OPTION #2
CREATE TABLE `NEWTABLE`
SELECT * FROM `TABLE`
WHERE NOT (value_was IS NULL OR value_was <= value_now);
DROP TABLE `TABLE`;
ALTER TABLE `NEWTABLE` RENAME `TABLE`;
No prior DELETE is needed here, since only the surviving rows are copied out. Keep in mind that CREATE TABLE ... SELECT does not copy indexes, so recreate them on the new table afterwards.
CAVEAT
Before you do anything, run these counts:
SELECT COUNT(1) INTO @Count_All FROM `TABLE`;
SELECT COUNT(1) INTO @Count_Zap FROM `TABLE`
WHERE value_was IS NULL OR value_was <= value_now;
SET @DeletePct = @Count_Zap * 100 / @Count_All;
SELECT @DeletePct;
@DeletePct is the percentage of the table that will be deleted if you run the DELETE.
If that percentage is low enough for you, then
DELETE FROM `TABLE` WHERE value_was IS NULL OR value_was <= value_now;
is all you need; defragmentation can wait. Otherwise, you may choose one of the options or live with the table's row fragmentation.
On a side note, if you wish to add indexes, please do so after defragmenting the table.
You can use window functions to achieve your goal. lag() and lead() are the ones that can help you, in a query like:
SELECT lag(di_timestamp) OVER ordering, lead(di_timestamp) OVER ordering
FROM data_item
WHERE fk_fc_id IN (35246,35247)
WINDOW ordering AS (ORDER BY di_timestamp
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW);
This will return the previous and the next timestamp for each row, where they exist (NULL otherwise).
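Note that with both fk_fc_id values in one ordered set, lag()/lead() will step across rows belonging to different fk_fc_id values. If you want the previous and next timestamp per fk_fc_id (an assumption about your intent), partition the window:

```sql
SELECT lag(di_timestamp)  OVER w AS prev_ts,
       lead(di_timestamp) OVER w AS next_ts
FROM   data_item
WHERE  fk_fc_id IN (35246, 35247)
WINDOW w AS (PARTITION BY fk_fc_id ORDER BY di_timestamp);
```

lag() and lead() ignore any frame clause, so the window definition only needs PARTITION BY and ORDER BY.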
Best Answer
There are many possible query styles; most will readily use your PK index on (sensor_id, time), as it fits the task. (Postgres can read indexes backwards practically as fast.) This should be near perfect: db<>fiddle here
LEFT JOIN .. ON true keeps sensors without any data entries in the result, with NULL values in place of data values.
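The pattern sketched below matches that description; the table and column names (sensor, reading, value) are assumptions, only the PK on (sensor_id, time) comes from the question:

```sql
SELECT s.sensor_id, r.time, r.value
FROM   sensor s
LEFT   JOIN LATERAL (
   SELECT time, value
   FROM   reading
   WHERE  sensor_id = s.sensor_id
   ORDER  BY time DESC   -- PK index on (sensor_id, time), read backwards
   LIMIT  1
   ) r ON true;          -- ON true keeps sensors with no readings
```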
Since you are on Postgres 11, a covering index might pay.
But it makes the index bigger and writes to the table more expensive, and your names indicate a write-heavy table. And since you only query a few rows at a time, queries don't get much faster anyway. So it's probably best the way you have it.
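Such a covering index might look like the sketch below; the index and column names are assumptions. Postgres 11's INCLUDE clause stores the payload column in the index leaf pages, enabling index-only scans without widening the key:

```sql
-- Sketch only: assumes a table reading(sensor_id, time, value).
CREATE INDEX reading_covering_idx
    ON reading (sensor_id, time) INCLUDE (value);
```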