PostgreSQL Query – How to Get Max Field Value Length for Each TEXT Field in All Tables

dynamic-sqlplpgsqlpostgresql

I am trying to get information about the data in my database. I have several columns in several tables that or type 'TEXT'. I want to perform a query that will create a result that looks like:

|table_name|column_name|max_length|
-----------------------------------
|my_table  |name       |35        |

I have the query to get all the tables and columns:

SELECT table_name, column_name FROM information_schema.columns
WHERE table_schema = 'public' AND data_type = 'text'
ORDER BY table_name

But I can't figure out how to now go through each of those tables and columns to get the max length of each column. I do have the query to get the max length of a specific column and table using:

SELECT 'my_table', 'name', MAX(LENGTH(name)) FROM my_table

How can I make a query that will get the max length dynamically based on the results of the query to get all tables and columns?

Best Answer

No need for PL/pgSQL.

You can use a variation of this solution to do everything in a single query.

select table_schema, 
       table_name,
       column_name, 
       (xpath('/row/max/text()',query_to_xml(format('select max(length(%I)) from %I.%I', column_name, table_schema, table_name), true, true, '')))[1]::text::int as max_length
from information_schema.columns
where table_schema = 'public'       
  and data_type = 'text'
order by table_schema, table_name, column_name;

Evan's comment got me thinking if there is a way to do this with a single statement per table (although I doubt that this performance optimization is really important for a "get me some information about my database query")

With a bit of XML juggling, it is indeed possible:

with counts as (
  select table_schema, 
         table_name,
         query_to_xml(format('select %s from %I.%I', string_agg(format('max(length(%I)) as %I', column_name, column_name), ', '), table_schema, table_name), 
                      true, true, '') as result
  from information_schema.columns
  where table_schema = 'public'       
    and data_type = 'text'
  group by table_schema, table_name
)
select table_schema, 
       table_name, 
       substring(x::text, 2, strpos(x::text,'>') - 2) as column_name,
       (xpath('./text()', x))[1]::text::int as max_length
from counts, unnest(xpath('//row/*', result)) as c(x);

For the pagila demo database this returns:

table_schema | table_name                 | column_name | max_length
-------------+----------------------------+-------------+-----------
public       | actor_info                 | film_info   |        830
public       | customer_list              | notes       |          6
public       | customer_list              | name        |         21
public       | film                       | description |        130
public       | film_list                  | actors      |        216
public       | film_list                  | description |        130
public       | nicer_but_slower_film_list | actors      |        201
public       | nicer_but_slower_film_list | description |        130
public       | sales_by_store             | manager     |         12
public       | sales_by_store             | store       |         19
public       | staff_list                 | name        |         12

The above first generates one statement per table containing one max(length(..)) for each column.

The result is then parsed and turned into rows again. It would be possible to aggregate that back into a single JSON.


To get one row per table, with the column information as a JSON value, the following can be used:

with counts as (
  select table_schema, 
         table_name,
         query_to_xml(format('select to_jsonb(t) as cols from (select %s from %I.%I) t', string_agg(format('max(length(%I)) as %I', column_name, column_name), ', '), table_schema, table_name), 
                      false, true, '') as result
  from information_schema.columns
  where table_schema = 'public'       
    and data_type = 'text'
  group by table_schema, table_name
)
select table_schema, 
       table_name, 
       (xpath('//row/cols/text()', result))[1]::text::jsonb as column_sizes
from counts;

For the pagila demo database this returns:

table_schema | table_name                 | column_sizes                       
-------------+----------------------------+------------------------------------
public       | actor_info                 | {"film_info": 830}                 
public       | customer_list              | {"name": 21, "notes": 6}           
public       | film                       | {"description": 130}               
public       | film_list                  | {"actors": 216, "description": 130}
public       | nicer_but_slower_film_list | {"actors": 201, "description": 130}
public       | sales_by_store             | {"store": 19, "manager": 12}       
public       | staff_list                 | {"name": 12}