According to your comment on a deleted post, you load all rows into a java module to conduct a search there. But searching is better done in the database itself - that's what a database is good at. Only return the rows you actually need.
If you really need all rows, there are many little things to make this faster. 1M rows will never be very fast, though.
Postgres 9.2 or later
You can make the index covering by appending fcv_id
:
CREATE INDEX factura_venta_orden
ON factura_venta (fcv_fecha_comprobante, fcv_numero_comprobante, fcv_id);
This way, provided the table isn't updated too much, Postgres can retrieve results with an index-only scan.
The additional column comes last since it does not contribute to the sort order. Explanation:
In Postgres 11 or later you could make that:
CREATE INDEX factura_venta_orden
ON factura_venta (fcv_fecha_comprobante, fcv_numero_comprobante) INCLUDE (fcv_id);
CLUSTER
/ pg_repack
I see you already found CLUSTER
. You are aware that this is a one-time operation, that should help your cause, but needs to be re-run after enough updates?
There is also the community tool pg_repack
as replacement for VACUUM FULL
/ CLUSTER
.
work_mem
This line in your EXPLAIN
output:
Sort Method: external merge Disk: 2928kB
tells us, that sorting is not done in RAM, which is expensive. You could probably improve performance by tuning the according setting for work_mem
work_mem (integer)
Specifies the amount of memory to be used by internal sort operations and hash tables before writing to temporary disk files. ...
Setting this too high may have adverse effects. Read the manual carefully. Consider increasing the setting only for the transaction with the big query:
BEGIN;
SET LOCAL work_mem = '50MB';
SELECT ...;
COMMIT;
50 MB are an estimate based on your EXPLAIN ANALYZE
output for 73k rows. Test with 1M rows to get the actual amount you need.
Index
A plain multicolumn B-tree index should work after all:
CREATE INDEX foo_idx
ON geoposition_records (equipment_id, created_at DESC NULLS LAST);
Why DESC NULLS LAST
?
It's safe to assume you have an equipment
table? Then performance won't be a problem:
Correlated subquery
Based on this equipment
table, run a lowly correlated subquery to great effect:
SELECT equipment_id
,(SELECT created_at
FROM geoposition_records
WHERE equipment_id = eq.equipment_id
ORDER BY created_at DESC NULLS LAST
LIMIT 1) AS latest
FROM equipment eq;
For a small number of rows in the equipment
table (57 judging from your EXPLAIN ANALYZE
output), that's very fast.
LATERAL
join in Postgres 9.3+
SELECT eq.equipment_id, r.latest
FROM equipment eq
LEFT JOIN LATERAL (
SELECT created_at
FROM geoposition_records
WHERE equipment_id = eq.equipment_id
ORDER BY created_at DESC NULLS LAST
LIMIT 1
) r(latest) ON true;
Detailed explanation:
Performance similar to the correlated subquery.
Function
If you can't talk sense into the query planner (which shouldn't occur), a function looping through the equipment table is certain to do the trick. Looking up one equipment_id
at a time uses the index.
CREATE OR REPLACE FUNCTION f_latest_equip()
RETURNS TABLE (equipment_id int, latest timestamp)
LANGUAGE plpgsql STABLE AS
$func$
BEGIN
FOR equipment_id IN
SELECT e.equipment_id FROM equipment e ORDER BY 1
LOOP
SELECT g.created_at
FROM geoposition_records g
WHERE g.equipment_id = f_latest_equip.equipment_id
-- prepend function name to disambiguate
ORDER BY g.created_at DESC NULLS LAST
LIMIT 1
INTO latest;
RETURN NEXT;
END LOOP;
END
$func$;
Makes for a nice call, too:
SELECT * FROM f_latest_equip();
Performance comparison:
db<>fiddle here
OLD sqlfiddle
Best Answer
Join to a subquery that computes
numero
with the window functionrow_number()
:Details for
UPDATE
syntax in the manual.If you have concurrent write access you need to lock the table to avoid race conditions.
Note that updating every row in a table is expensive either way. The table typically grows to twice its size and
VACUUM
orVACUUM FULL
may be in order.Depending on your complete situation it may be more efficient to write a new table to begin with. Related answers with instructions:
I am not convinced, though, that you need the column
numero
in your table at all. Maybe you are looking for aMATERIALIZED VIEW
. Recent related answer on SO: