PostgreSQL Python – Algorithm for Populating Tables

postgresqlpython

I'm writing an algorithm to populate all tables in a database in Postgres. I already get all tables that have no relationship populated, as follows:

  1. I get all the tables with the following query:

    SELECT table_name
       FROM information_schema.tables
    WHERE table_type = 'BASE TABLE' AND
       table_schema NOT IN ('pg_catalog', 'information_schema')
    
  2. For the list of tables obtained, I make a loop, and for each table I get its properties (column name, data type, accepts null or not, maximum characters):

    SELECT COLUMN_NAME, DATA_TYPE,
       IS_NULLABLE, CHARACTER_MAXIMUM_LENGTH
    FROM INFORMATION_SCHEMA.COLUMNS
       WHERE TABLE_NAME = 'TABLE_NAME'
    
  3. With this list of properties I dynamically create the INSERT

    INSERT INTO TABLE_NAME(prop1, prop2, prop3) VALUES(value1, value2, value3)
    

Each value is randomly generated based on the property type, eg:

  • Integer: 65422
  • Character: "Lorem ipsum…"
  • Date: 2016-12-12 20:00

And so on for each data type accepted by the algorithm. If it finds some kind of data that is not accepted, the application terminates (which is expected).

  1. And finally, with INSERT mounted I execute the query in the database

The steps in this algorithm work perfectly for tables where there are no relationships.


For tables that have relationships (1:N, 1:1, N:N), I would need to first find all the foreign keys and understand which tables they see, to enter the data in this table "Father" save the IDs and then insert In the "Daughters" tables thus associating their foreign keys. But how to do it in a simple way?

It is interesting to note that I can already get the list of all foreign keys through the following query:

SELECT conrelid::regclass AS table_from
      ,conname
      ,pg_get_constraintdef(c.oid)
FROM   pg_constraint c
JOIN   pg_namespace n ON n.oid = c.connamespace
WHERE  contype IN ('f', 'p ')
AND    n.nspname = 'public' -- your schema here
ORDER  BY conrelid::regclass::text, contype DESC;

I am grateful for suggestion.

Note: The algorithm is being developed in Python.

Best Answer

You can use a query like below to get all tables with dependencies;

WITH fkeys AS (
SELECT

  c.conrelid AS table_id,
  c_fromtablens.nspname AS schemaname,
  c_fromtable.relname AS tablename,

  c.confrelid AS parent_id,
  c_totablens.nspname AS parent_schemaname,
  c_totable.relname AS parent_tablename

FROM pg_constraint c
JOIN pg_namespace n ON n.oid = c.connamespace

JOIN pg_class c_fromtable ON c_fromtable.oid = c.conrelid
JOIN pg_namespace c_fromtablens ON c_fromtablens.oid = c_fromtable.relnamespace

JOIN pg_class c_totable ON c_totable.oid = c.confrelid
JOIN pg_namespace c_totablens ON c_totablens.oid = c_totable.relnamespace
WHERE
  c.contype = 'f'
)

SELECT
  t.schemaname,
  t.tablename,
  fkeys.parent_schemaname,
  fkeys.parent_tablename

FROM pg_tables t
LEFT JOIN fkeys ON  t.schemaname = fkeys.schemaname AND 
                    t.tablename =  fkeys.tablename 
WHERE
  t.schemaname NOT IN ('pg_catalog', 'information_schema')

ORDER BY
  3 NULLS FIRST,
  4 NULLS FIRST

Moreover, the query below provides you foreign key details. It provides relation types by using primary key definition. Note that it does not check unique constrains.

WITH fkey AS (

SELECT
  n.nspname AS fkey_schema_name,
  c.conname AS fkey_name,

  c_fromtablens.nspname AS table_schema_name,
  c_fromtable.relname AS table_name,

  c_totablens.nspname AS foreign_table_schema_name,
  c_totable.relname AS foreign_table_name,

  c.conrelid,
  unnest(c.conkey) AS fkey_field_num,

  c.confrelid,
  unnest(c.confkey) AS fkey_foreign_field_num

FROM pg_constraint c
JOIN pg_namespace n ON n.oid = c.connamespace

JOIN pg_class c_fromtable ON c_fromtable.oid = c.conrelid
JOIN pg_namespace c_fromtablens ON c_fromtablens.oid = c_fromtable.relnamespace

JOIN pg_class c_totable ON c_totable.oid = c.confrelid
JOIN pg_namespace c_totablens ON c_totablens.oid = c_totable.relnamespace
WHERE
  c.contype = 'f'

),

pkey AS (

SELECT
  c.conrelid,
  unnest(c.conkey) AS conkey
FROM pg_constraint c
WHERE
  c.contype = 'p'
)

SELECT 
  fkey.fkey_schema_name,
  fkey.fkey_name,
  fkey.table_schema_name,
  fkey.table_name,
  fkey.foreign_table_schema_name,
  fkey.foreign_table_name,
  a.attname AS field_name,
  a_f.attname AS foreign_field_name,
  CASE
    WHEN pkey.conrelid IS NULL AND pkeyf.conrelid IS NULL THEN 'N-N'
    WHEN pkey.conrelid IS NOT NULL AND pkeyf.conrelid IS NOT NULL THEN '1-1'
    WHEN pkey.conrelid IS NULL AND pkeyf.conrelid IS NOT NULL THEN 'N-1'
  END AS relation_type
FROM fkey
JOIN pg_attribute a ON a.attrelid = fkey.conrelid AND a.attnum = fkey.fkey_field_num
JOIN pg_attribute a_f ON a_f.attrelid = fkey.confrelid AND a_f.attnum = fkey.fkey_foreign_field_num
LEFT JOIN pkey ON pkey.conrelid = a.attrelid AND pkey.conkey = a.attnum
LEFT JOIN pkey pkeyf ON pkeyf.conrelid = a_f.attrelid AND pkeyf.conkey = a_f.attnum