Index
First and foremost, for your type of query this is the much better index:
CREATE INDEX de_tt_priceinfo_station_id_received_idx
ON public.de_tt_priceinfo (station_id, received); -- note the reversed order
Since the combination is supposed to be unique (I assume), I suggest a UNIQUE constraint on (station_id, received) instead:
ALTER TABLE de_tt_priceinfo ADD CONSTRAINT de_tt_priceinfo_station_id_received
UNIQUE (station_id, received);
The index index_station_id is mostly superseded and can probably be dropped now. The index de_tt_priceinfo_received_station_id_idx may still have its use.
Be sure to understand the logic behind all this.
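In short, the rule the column order follows: columns tested for equality first, range / sort columns last. A minimal sketch of the access pattern the index supports (timestamp and id taken from the question):

```sql
-- Equality on the leading column, range + ORDER BY on the second:
SELECT *
FROM   de_tt_priceinfo
WHERE  station_id = '0C91A93A-a-b-c-d'                  -- equality: 1st column
AND    received  <= '2014-09-25 08:45:12'::timestamptz  -- range:    2nd column
ORDER  BY received DESC                                 -- served by the index
LIMIT  1;
-- With the columns reversed, the index could not jump straight to one
-- station; it would have to scan all entries up to the cutoff timestamp
-- and filter on station_id afterwards.
```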
Query
I would also consider the basic DISTINCT ON query:
SELECT DISTINCT ON (station_id)
station_id, e5, e10, diesel, received
FROM de_tt_priceinfo
WHERE received <= '2014-09-25 08:45:12'::TIMESTAMPTZ
AND station_id = ANY ('{0C91A93A-a-b-c-d, 578C44BB-a-b-c-d, 6F2F48A8-a-b-c-d
, 9982BE74-a-b-c-d, A24C612B-a-b-c-d, BEC3EF55-a-b-c-d
, F5137488-a-b-c-d}'::varchar[])
ORDER BY station_id, received DESC;
But since you seem to have a lot of rows per station, that's not going to shine. Instead:
SELECT *
FROM (
VALUES
('0C91A93A-a-b-c-d'::varchar)
, ('578C44BB-a-b-c-d')
, ('6F2F48A8-a-b-c-d')
, ('9982BE74-a-b-c-d')
, ('A24C612B-a-b-c-d')
, ('BEC3EF55-a-b-c-d')
, ('F5137488-a-b-c-d')
) s(station_id)
LEFT JOIN LATERAL (
SELECT e5, e10, diesel, received
FROM de_tt_priceinfo
WHERE station_id = s.station_id
AND received <= '2014-09-25 08:45:12'::TIMESTAMPTZ
ORDER BY received DESC
LIMIT 1
) p ON TRUE
This one should be dynamite in combination with the UNIQUE constraint above (or an equivalent index).
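To verify that the plan really does one cheap index descent per station, check with EXPLAIN - a sketch; the exact plan text depends on your data and Postgres version:

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM  (VALUES ('0C91A93A-a-b-c-d'::varchar)) s(station_id)
LEFT  JOIN LATERAL (
   SELECT e5, e10, diesel, received
   FROM   de_tt_priceinfo
   WHERE  station_id = s.station_id
   AND    received <= '2014-09-25 08:45:12'::timestamptz
   ORDER  BY received DESC
   LIMIT  1
   ) p ON true;
-- Look for "Index Scan Backward using ..." in the output:
-- one such scan per station, each stopping after 1 row,
-- instead of reading and sorting many rows per station.
```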
Detailed explanation below.
Table definition
For a table with millions of rows it pays to optimize storage while easily possible. Makes everything smaller and faster.
That's how I would design it:
CREATE TABLE station (
station_id serial PRIMARY KEY
, station text
, CHECK (length(station) < 61) -- ?? optional, you decide
);
CREATE TABLE priceinfo (
priceinfo_id serial PRIMARY KEY
, station_id integer NOT NULL REFERENCES station ON UPDATE CASCADE
, received timestamptz NOT NULL DEFAULT now()
, e5 integer -- price in 0.1 Cent
, e10 integer -- price in 0.1 Cent
, diesel integer -- price in 0.1 Cent
, CONSTRAINT priceinfo_station_id_received UNIQUE (station_id, received)
);
CREATE INDEX priceinfo_received_idx ON public.priceinfo (received);
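A quick usage sketch for the redesigned tables - the sample values are made up:

```sql
-- Register a station once; the serial PK hands out a small integer key:
INSERT INTO station (station) VALUES ('Sample Station') RETURNING station_id;

-- Prices in 0.1 Cent: 1579 = 1.579 € per liter
INSERT INTO priceinfo (station_id, e5, e10, diesel)
VALUES (1, 1579, 1539, 1359);  -- station_id 1 from the INSERT above
```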
The row size in priceinfo
would be 60 bytes (24 heap tuple header + null bitmap; 32 bytes data; 4 bytes item identifier), as compared to 94 bytes (24 + 66 + 4) in your original table. That's assuming 16-character string like in your example. Everything will be ~ 36 % smaller (or more?) and considerably faster.
The crucial index on (station_id, received) is down to 8 bytes of data per index tuple instead of 32 bytes or even much more (!) - each plus overhead. In addition, handling integer numbers for station_id is generally faster than text with a COLLATION on top of it.
Details:
The query would fetch station_id from the station table first, which is cheap.
Prices are stored as integer numbers signifying 0.1 Cent (4 bytes instead of 10 bytes for your original numeric(4,3)). Multiply by 0.1 to get Cent or by 0.001 to get € for display. Very simple and fast.
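The conversion for display boils down to this (column values made up):

```sql
-- e5 = 1579 stored as 0.1 Cent units:
SELECT e5 * 0.1   AS e5_cent  -- 1579 -> 157.9 Cent
     , e5 * 0.001 AS e5_eur   -- 1579 -> 1.579 €
FROM   priceinfo;
```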
UUID
The string in the error message looks considerably longer and actually like a regular UUID number:
871828b4-37e5-419c-b7a5-cdbe1e1c0148
If so, use the uuid data type, whether you adopt my design or keep your old one. At least switch to the uuid data type for a big overall gain in every aspect.
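Converting the existing column is a one-liner, assuming every stored value is a valid UUID string (a sketch; any depending indexes are rebuilt automatically):

```sql
ALTER TABLE de_tt_priceinfo
  ALTER COLUMN station_id TYPE uuid USING station_id::uuid;
-- uuid is 16 bytes fixed, vs. 37 bytes (36 characters + length byte)
-- for the same value stored as varchar.
```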
Best Answer
If each of these 3000 connections used work_mem only once (this limit is per node of the query plan, so it can be used several times by one database session), that could use almost 1 TB RAM. I am not surprised that you are going OOM.
max_connections = 3000 is an insane setting. Do you have 3000 cores? Can your storage subsystem handle 3000 concurrent I/O requests?
I recommend that you use pgBouncer or another connection pool to reduce the number of database sessions.
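A few starting points at the SQL level - the target number is illustrative, pick one for your own hardware:

```sql
SHOW max_connections;   -- currently 3000
SHOW work_mem;          -- worst case ~ sessions x work_mem x plan nodes
SELECT count(*) FROM pg_stat_activity;  -- how many sessions are really active?

-- With a pooler (e.g. pgBouncer) in front, something far smaller suffices:
ALTER SYSTEM SET max_connections = 100; -- takes effect after a restart
```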
By the way, vm.overcommit_ratio should be 100 if you have no swap; otherwise you cannot allocate more than 80% of the available RAM (but perhaps that's on purpose).