Postgresql – Get all distinct values of specific column in all tables in a given schema (Postgres)

postgresql

I have a schema in which all tables have the same basic set of columns. Now I want to query a specific column in all existing tables dynamically to get the distinct list of values of that column. The result should be a list consisting of all "table_name" and "unique_value" combinations (of that queried column in that table).

The result should be this (with the given test set below):

table_name | query_me
-------------------------
sql_test_1 | a
sql_test_1 | b
sql_test_1 | c
sql_test_2 | a
sql_test_2 | f
sql_test_2 | g

Important: I do not want to hard-code the table names I will query, I really want to query "all existing tables (with a given name scheme)".

I unfortunately have absolutely no clue how to do this in (Postgre)SQL. Can somebody help?

test case

CREATE TABLE IF NOT EXISTS public.sql_test_1 (
    id serial PRIMARY KEY,
    query_me varchar(10) NOT NULL,
    val TEXT
);
CREATE TABLE IF NOT EXISTS public.sql_test_2 (
    id serial PRIMARY KEY,
    query_me varchar(10) NOT NULL,
    val TEXT
);

INSERT
    INTO
    public.sql_test_1 (query_me, val)
VALUES 
    ('a', 'hey'),
    ('b', 'ho'),
    ('c', 'silver'),
    ('a', 'let''s'),
    ('b', 'ride');

INSERT
    INTO
    public.sql_test_2 (query_me, val)
VALUES 
    ('a', 'hey'),
    ('f', 'ho'),
    ('g', 'silver'),
    ('a', 'let''s'),
    ('f', 'ride');

expected result see above.

Best Answer

You can use query_to_xml() to dynamically run a SQL statement and get the result as an XML string. Using XMLTABLE this can then be converted back into rows and columns

with queries as (    
  select format('select distinct %L as table_name, %I as col from %I.%I', c.table_name, c.column_name, c.table_schema, c.table_name) as sql
  from information_schema.columns c
  where c.table_schema = 'public'
    and c.column_name = 'query_me'
), results as (
  select query_to_xml(q.sql, true, false, '')  as data
  from queries q
)
select x.*
from results r
  cross join xmltable('/table/row' passing r.data
                       columns table_name text path 'table_name', 
                               value      text path 'col') as x
order by x.table_name

The first CTE returns something like this:

select distinct 'sql_test_1' as table_name, query_me as col from public.sql_test_1
select distinct 'sql_test_2' as table_name, query_me as col from public.sql_test_2

The second CTE then runs each of those queries and returns the result as XML. Something like this:

<table xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<row>
  <table_name>sql_test_1</table_name>
  <col>c</col>
</row>

<row>
  <table_name>sql_test_1</table_name>
  <col>a</col>
</row>

<row>
  <table_name>sql_test_1</table_name>
  <col>b</col>
</row>

</table>

And the final step then extracts the values from the generated XML and returns this based on your sample data:

table_name | value
-----------+------
sql_test_1 | c    
sql_test_1 | a    
sql_test_1 | b    
sql_test_2 | g    
sql_test_2 | f    
sql_test_2 | a    

Online example