PostgreSQL – How to Join External Data in SELECT Query

optimizationperformancepostgresqlpostgresql-performancetemporary-tables

I have few (~20) names of products corresponding to an integer id that needs to be joined on products while exporting products data (using COPY (<query>) TO). The burden is that these names do not rest in DB, rather in Redis (rails's translations fyi). Therefore, I'll be collecting these names before each export. Now, having these names outside the DB, how can I build the query to join them?

I've tried to join these names on-the-fly:

SELECT
...
LEFT JOIN (
    SELECT unnest('{1,2,3}'::int[]) AS id,
           unnest($${"name1", "name2", "name3"}$$::text[]) AS name
) AS product_names ON product_names.id = products.type_id
...

But I've got busted by the amount of products rows (~500k) and lacking indices for that structure. Is there some (better) way to join these data on-fly? I guess one cannot index these on-fly data though.

Which is a good way to solve that situation? I can think of creating temporary table like:

SELECT ... INTO TEMPORARY ...;

but is it wise to have that stay on for long transaction, while exporting (~30 mins)? Or can I omit transactions while using temporary tables and how? Or am I just troubling myself with needless problems while the best solution is to just store those names as a regular table, only truncating it before each export to have actual names?

— postgres 9.3.10

Best Answer

Assuming current Postgres 9.4.

Since you are dealing with just ~ 20 names of products, it's largely irrelevant how you provide those to the query. An index would be a waste of time, Postgres is going to use a sequential scan anyway (judging from what you disclosed in the question).

You can use a temporary table, which lives for the duration of the session, not just transaction like you seem to assume. And you can even create indexes on temp tables, it just wouldn't help in this case.
Be sure to run ANALYZE on temp tables manually:

Or you can unnest a provided array literal just as well. But don't rely on the outdated trick of unnesting two arrays in the SELECT list in parallel. This is error prone. Move it to the from clause. I'd suggest:

SELECT
...
LEFT JOIN unnest('{name1,name2,name3}'::text[]) WITH ORDINALITY
  product_names (name, type_id) USING (type_id)
...

More:

Assuming the column name type_id is unambiguous in the tables to the left.

If you don't actually have ascending type_id starting with 1 like your example suggests, consider the new variant of unnest() taking multiple array parameters:

SELECT
...
LEFT JOIN unnest('{4,7,9}'::int[], '{name1,name2,name3}'::text[])
  product_names (type_id, name) USING (type_id)
...

More: