You can accomplish what you're asking for by doing something like this.
I created a table and data to help better illustrate what I'm doing.
CREATE TABLE logs (id serial NOT NULL PRIMARY KEY, lognum int, flightnum int);
INSERT INTO logs (lognum, flightnum) VALUES (1,6);
INSERT INTO logs (lognum, flightnum) VALUES (1,7);
INSERT INTO logs (lognum, flightnum) VALUES (1,8);
INSERT INTO logs (lognum, flightnum) VALUES (2,80);
INSERT INTO logs (lognum, flightnum) VALUES (3,12);
INSERT INTO logs (lognum, flightnum) VALUES (4,8008);
postgres@[local]:5432:postgres:=# SELECT * FROM logs;
id | lognum | flightnum
----+--------+-----------
13 | 1 | 6
14 | 1 | 7
15 | 1 | 8
16 | 2 | 80
17 | 3 | 12
18 | 4 | 8008
(6 rows)
Time: 0.188 ms
postgres@[local]:5432:postgres:=#
Note that 1, 2, and 4 have flights with 8, 80, and 8008.
Now, using this query, find the lognums
with 8 as a flightnum
. This takes your original query with an array_agg
, wraps another query around it to generates subscripts for each member in the array, which can be arbitrarily large. Finally, an outer query wraps that, which uses the generated subscript to allow you to do a comparison against each of the members of the flightnums
array to see if they are LIKE '8%'
.
SELECT lognum FROM (
SELECT lognum,
flightnums,
generate_subscripts(flightnums, 1) AS s
FROM (
SELECT lognum,
array_agg(flightnum) AS flightnums
FROM logs
GROUP BY lognum
) AS t1
) AS t2
WHERE flightnums[s]::text LIKE '8%' ORDER BY lognum;
Which gives you the following output
postgres@[local]:5432:postgres:=# SELECT lognum FROM (SELECT lognum, flightnums, generate_subscripts(flightnums, 1) AS s FROM ( SELECT lognum, array_agg(flightnum) AS flightnums FROM logs GROUP BY lognum) AS t1) AS t2 WHERE flightnums[s]::text LIKE '8%' ORDER BY lognum;
lognum
--------
1
2
4
(3 rows)
Time: 0.338 ms
postgres@[local]:5432:postgres:=#
As would be expected from the data above.
For further array manipulation needs, I would recommend reading up on their chapter in the PostgreSQL documentation arrays
The problem with your query is the join condition id = ANY(ancestors)
. Not only does it not preserve original order, it also eliminates duplicate elements in the array. (An id
could match 10 elements in ancestors
, it would still be picked once only.) Not sure if the logic of your query would allow duplicate elements, but if it does I am pretty sure you want to preserve all instances - you want to keep "original order" after all.
Assuming current Postgres 9.4+ for lack of information, I suggest a completely different approach:
SELECT n.entity_id, p.ancestors
FROM tree t
JOIN nodes n ON n.id = t.node_id
LEFT JOIN LATERAL (
SELECT ARRAY (
SELECT p.entity_id
FROM unnest(t.ancestors) WITH ORDINALITY a(id, ord)
JOIN entity.nodes p USING (id)
ORDER BY ord
) AS ancestors
) p ON true;
You query only works as intended if nodes.id
is defined as primary key and nodes.entity_id
is unique as well. Information is missing in the question.
Normally, this simpler query without explicit ORDER BY
works as well, but there are no guarantees (Postgres 9.3+)...
SELECT n.entity_id, p.ancestors
FROM tree t
JOIN nodes n ON n.id = t.node_id
LEFT JOIN LATERAL (
SELECT ARRAY (
SELECT p.entity_id
FROM unnest(t.ancestors) id
JOIN entity.nodes p USING (id)
) AS ancestors
) p ON true;
You can make this safe as well. Detailed explanation:
SQL Fiddle demo for Postgres 9.3.
Opional optimization
You join to entity.nodes
twice - to substitute for node_id
and ancestors
alike. An alternative would be to fold both into one array or one set and join only once. Might be faster, but you have to test.
For these alternatives we need the ORDER BY
in any case:
Add node_id
to the ancestors
array before we unnest ...
SELECT p.arr[1] AS entity_id, p.arr[2:2147483647] AS ancestors
FROM tree t
LEFT JOIN LATERAL (
SELECT ARRAY (
SELECT p.entity_id
FROM unnest(t.node_id || t.ancestors) WITH ORDINALITY a(id, ord)
JOIN entity.nodes p USING (id)
ORDER BY ord
) AS arr
) p ON true;
Or add node_id
to the unnested elements of ancestors
before we join ...
SELECT p.arr[1] AS entity_id, p.arr[2:2147483647] AS ancestors
FROM tree t
LEFT JOIN LATERAL (
SELECT ARRAY (
SELECT p.entity_id
FROM (
SELECT t.node_id AS id, 0 AS ord
UNION ALL
SELECT * FROM unnest(t.ancestors) WITH ORDINALITY
) x
JOIN entity.nodes p USING (id)
ORDER BY ord
) AS arr
) p ON true;
You did not show our CTE, this might be optimized further ...
Best Answer
Keep the basic query you already have to cheaply identify rows that have any relevant array elements using an index.
Then unnest only the intersection (
tag_ids & '{1,2,4,11}'
) in aLATERAL
join. Finally, aggregate:Once again, the intersection operator
&
of the intarray module is instrumental.Previous related answer:
Without intarray
If you don't have
intarray
installed, or for any other array type, we need another join instead:Subtle difference: the intersection operator folds duplicates to produce distinct elements, while this query does not. Only matters if there can be duplicate array elements ...
SQL Fiddle demonstrating both and with an additional row that has a mix of matching and not matching elements to demonstrate the need of the intersection or additional join to eliminate unwanted elements.