You can accomplish what you're asking for by doing something like this.
I created a table and some sample data to illustrate what I'm doing.
CREATE TABLE logs (id serial NOT NULL PRIMARY KEY, lognum int, flightnum int);
INSERT INTO logs (lognum, flightnum) VALUES (1,6);
INSERT INTO logs (lognum, flightnum) VALUES (1,7);
INSERT INTO logs (lognum, flightnum) VALUES (1,8);
INSERT INTO logs (lognum, flightnum) VALUES (2,80);
INSERT INTO logs (lognum, flightnum) VALUES (3,12);
INSERT INTO logs (lognum, flightnum) VALUES (4,8008);
postgres@[local]:5432:postgres:=# SELECT * FROM logs;
 id | lognum | flightnum
----+--------+-----------
 13 |      1 |         6
 14 |      1 |         7
 15 |      1 |         8
 16 |      2 |        80
 17 |      3 |        12
 18 |      4 |      8008
(6 rows)
Time: 0.188 ms
postgres@[local]:5432:postgres:=#
Note that lognums 1, 2, and 4 have flightnums starting with 8 (8, 80, and 8008, respectively).
Now, use this query to find the lognums with a flightnum starting with 8. It takes your original query with array_agg, wraps another query around it that generates a subscript for each member of the array (which can be arbitrarily large), and finally wraps that in an outer query that uses the generated subscript to compare each member of the flightnums array against LIKE '8%'.
SELECT lognum FROM (
    SELECT lognum,
           flightnums,
           generate_subscripts(flightnums, 1) AS s
    FROM (
        SELECT lognum,
               array_agg(flightnum) AS flightnums
        FROM logs
        GROUP BY lognum
    ) AS t1
) AS t2
WHERE flightnums[s]::text LIKE '8%'
ORDER BY lognum;
Which gives you the following output:
postgres@[local]:5432:postgres:=# SELECT lognum FROM (SELECT lognum, flightnums, generate_subscripts(flightnums, 1) AS s FROM ( SELECT lognum, array_agg(flightnum) AS flightnums FROM logs GROUP BY lognum) AS t1) AS t2 WHERE flightnums[s]::text LIKE '8%' ORDER BY lognum;
 lognum
--------
      1
      2
      4
(3 rows)
Time: 0.338 ms
postgres@[local]:5432:postgres:=#
As would be expected from the data above.
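For reference, here is a sketch of an equivalent, more compact formulation against the same logs table, unnesting the aggregated array instead of generating subscripts:
-- Unnest the aggregated array back into rows, then filter on the text form
SELECT DISTINCT lognum
FROM (
    SELECT lognum,
           unnest(array_agg(flightnum)) AS f
    FROM logs
    GROUP BY lognum
) AS t
WHERE f::text LIKE '8%'
ORDER BY lognum;
The DISTINCT guards against a lognum appearing once per matching flight; the subscript version above has the same potential for duplicates.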
For further array-manipulation needs, I would recommend reading the arrays chapter in the PostgreSQL documentation: https://www.postgresql.org/docs/current/arrays.html
Is there a better way of doing this?
Yes, use a temp table. There is nothing wrong with creating an indexed temp table when your query is that insane.
BEGIN;
CREATE TEMP TABLE myitems ( item_id int PRIMARY KEY );
INSERT INTO myitems(item_id) VALUES (1), (2); -- and on and on
-- no separate CREATE INDEX is needed; the PRIMARY KEY above already provides one
COMMIT;
ANALYZE myitems;
SELECT item_id, other_stuff, ...
FROM (
    SELECT
        -- Partitioned row number as we only want N rows per id
        ROW_NUMBER() OVER (PARTITION BY item_id ORDER BY start_date) AS r,
        item_id, other_stuff, ...
    FROM mytable
    INNER JOIN myitems USING (item_id)
    WHERE end_date > $2
    ORDER BY item_id ASC, start_date ASC, allowed ASC
) x
WHERE x.r <= 12;
But even better than that...
"500k different item_id" ... "int array can contain up to 15,000 elements"
You're selecting 3% of your database individually. I have to wonder if you're not better off creating groups/tags etc in the schema itself. I have never personally had to send 15,000 different IDs into a query.
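As a sketch only (the item_groups table and group_id column are hypothetical names, not from your schema), that idea could look like this:
-- Assign items to groups once, instead of shipping thousands of IDs per query
CREATE TABLE item_groups (
    group_id int NOT NULL,
    item_id  int NOT NULL,
    PRIMARY KEY (group_id, item_id)
);
-- The huge ID list then collapses to a join on the group
SELECT m.item_id, m.other_stuff
FROM mytable m
INNER JOIN item_groups g USING (item_id)
WHERE g.group_id = $1;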
Best Answer
Assuming the element to check for is never NULL, your original method delivers superior performance in the presence of a matching index for the array column (int[] in your example).
If it's all about integer arrays, consider the additional module intarray for superior performance.
If your column actually is the array element in the expression (plain integer in your example), consider reversing the construct so the indexed scalar column is compared against the array of values.
OTOH, if NULL values can be involved on either side of the expression and you don't want NULL for NULL input, but rather to treat NULL like any other element, then use array_position() (Postgres 9.5 or later). Both cases are sketched below.
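Sketches only, with assumed names (tbl, id, arr) and literal values that are not from the question:
-- Scalar column tested against an array of values; a plain btree index on id applies
SELECT * FROM tbl WHERE id = ANY ('{1,2,3}'::int[]);
-- array_position() treats NULL like any other element (Postgres 9.5 or later)
SELECT * FROM tbl WHERE array_position(arr, 8) IS NOT NULL;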
For tests without index support and no NULL values involved (or if you are happy with NULL on NULL input) and performance is not important, use the generic ANY construct like Vérace demonstrates.
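In this document's terms, that generic construct would look something like this (again a sketch, with an assumed table tbl and array column arr):
-- Is the element 8 present anywhere in the array column?
SELECT * FROM tbl WHERE 8 = ANY (arr);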