PostgreSQL Dynamic SQL – How to Make a Table Name an Attribute

dynamic-sqlplpgsqlpostgresql

I have received hundreds of tables in PostgreSQL with identical structure and each containing similar data of a different day, so that the table name is the date of the data.

I would like to Union them all into a single table but without losing that very critical information that is the date of the data. I guess that I should somehow use information_schema.tables and make use of PL/pgsql which I don't know and therefore should learn, but to be honest this seems to be a big effort for a one-shot operation. So, my question would be:
Is there an easy way to achieve this or what would be the best solution?

Best Answer

If you want to have some magic which does this automatically, you can use a function that dynamically creates a select statement and returns a union of all tables:

create or replace function get_all()
  returns table (data_source text, id integer)
as
$body$
declare
   full_query text := '';
   table_rec  record;
   row_nr     integer := 1;
begin
   for table_rec in select table_name 
                    from information_schema.tables 
                    where table_schema = 'public'     -- adjust here for your schema
                      and table_name like 'table_%'   -- adjust here for your prefix
                    order by table_name
   loop
     if row_nr > 1 then  
        full_query := full_query || ' union all ';
     end if;
     full_query := full_query || ' select '||quote_literal(table_rec.table_name)::text||' as data_source, t.id FROM '||quote_ident(table_rec.table_name)::text||' as t';
     row_nr := row_nr + 1;
   end loop;
   return query execute full_query;
end;
$body$
language plpgsql;

The above function can then be used like this:

select *
from get_all();

And you'll get something like this:

data_source         id
table_2014-07_01    ..
table_2014-07_01    ..
.....
table_2014-07_02    ...
table_2014-07_02    ...

If you want a different value for the "data_source" column, just do some pattern matching/replacing inside the function.

Another option would be to change the function to (re)create a view (instead of directly returning the data). That could make retrieving a bit faster.

You can also (re)create a materialized view using dynamic SQL in order to make the retrieval faster (because the result is then present in a single "table" that can be indexed properly).