Unfortunately = ANY (array)
only works with an array literal on the right hand side, not a sub-select.
You need to "normalize" your de-normalized model, using unnest()
:
SELECT latitude, longitude
FROM userloc
WHERE id IN (SELECT unnest(interested)
FROM donedeals
WHERE deals_id = 64);
If deals_id
is unique in the donedeals
table, another option is to "convert" the id
on the left side to an array and then use the "is contained by" operator: <@
:
SELECT latitude, longitude
FROM userloc
WHERE array[id] <@ (SELECT interested
FROM donedeals
WHERE deals_id=64 );
Not sure which one would be faster. You will need to check the execution plan.
Aaron,
In my recent work, I've been looking into some similar questions with PostgreSQL. PostgreSQL is almost always pretty good at generating the right query plan, but it isn't always perfect.
Some simple suggestions would be to make sure to run an ANALYZE
on your progresses
table to make sure that you have updated statistics, but this isn't guaranteed to fix your problems!
For reasons that are probably too long-winded for this post, I've found some odd behaviors in the statistics gathering of ANALYZE
and the query planner that may need to be resolved in the long-term. In the short-term the trick is to rewrite your query to try and hack out the query plan you want.
Without having access to your data for testing, I'll make the following two possible suggestions.
1) Use ARRAY()
PostgreSQL treats arrays and sets of records differently in its query planner. Sometimes you'll end up with an identical query plan. In this case, as in many of my cases, you don't.
In your original query you had:
EXPLAIN SELECT "v_latest_progresses".* FROM "v_latest_progresses"
WHERE "v_latest_progresses"."user_id"
IN (SELECT "users"."id" FROM "users" WHERE "users"."company_id"=44);
As a first pass at trying to fix it, try
EXPLAIN SELECT "v_latest_progresses".* FROM "v_latest_progresses"
WHERE "v_latest_progresses"."user_id" =
ANY(ARRAY(SELECT "users"."id" FROM "users" WHERE "users"."company_id"=44));
Note the change of the subquery from IN
to =ANY(ARRAY())
.
2) Use CTEs
Another trick is to force separate optimizations, if my first suggestion doesn't work. I know many people use this trick, because queries within a CTE are optimized and materialized separate from the main query.
EXPLAIN
WITH user_selection AS(
SELECT "users"."id" FROM "users" WHERE "users"."company_id"=44
)
SELECT "v_latest_progresses".* FROM "v_latest_progresses"
WHERE "v_latest_progresses"."user_id" =
ANY(ARRAY(SELECT "id" FROM user_selection));
Essentially, by creating the CTE user_selection
using the WITH
clause, you are asking PostgreSQL to perform a separate optimization on the subquery
SELECT "users"."id" FROM "users" WHERE "users"."company_id"=44
and then materializing those results. I then, once again use the =ANY(ARRAY())
expression to try to manually manipulate the plan.
In these cases, you probably can't trust just the results of EXPLAIN
, because it already thought that it found the least costly solution. Make sure to run an EXPLAIN (ANALYZE,BUFFERS)...
to find out what it really costs in terms of time and page reads.
Best Answer
This is not possible by a single query, because queries cannot span multiple databases, and the list of tables is only contained within the individual databases.