Postgresql – How to adopt the PostgreSQL naming convention in legacy database

identifierpostgresql

I realize I'll need to update the queries, but if I have a database that has camelCase or spaces in it, you have to use double quotes on the identifiers (viz. schema, table, column). How do I migrate away from taking into account that I can't have capital letters, nor spaces in my identifiers. I need to normalize all of them to snake_case.

Best Answer

Moving to PostgreSQL convention

In PostgreSQL by convention and with good reason, we neither use spaces nor capital letters in our identifiers (columns, tables, schemas, etc). The use of _ is purely style though. In this example, we

  1. Transliterate all ' ' to '_'
  2. Migrate from camelCase to snake_case

This code changes nothing, but outputs the respective ALTER commands to update the schema (including the schema, table, and column names)

-- Columns first.
SELECT FORMAT(
  'ALTER TABLE %I.%I.%I RENAME COLUMN %I TO %I;',
  table_catalog,
  table_schema,
  table_name,
  column_name,
  lower(
    -- replace all spaces with _, xX and Xx becomes x_x
    regexp_replace(
      -- First, replace spaces with an _
      replace(column_name, ' ', '_'),
      '([[:lower:]])([[:upper:]])',
      '\1_\2',
      'g'
    )
  )
)
FROM information_schema.columns
WHERE column_name ~ ' |[[:lower:]][[:upper:]]'

-- Tables
UNION ALL
  SELECT FORMAT (
    'ALTER TABLE %I.%I.%I RENAME TO %I;', 
    table_catalog,
    table_schema,
    table_name,
    lower(
      regexp_replace(
        replace(table_name, ' ', '_'),
        '([[:lower:]])([[:upper:]])',
        '\1_\2',
        'g'
      )
    )
  )
  FROM information_schema.tables
  WHERE table_name ~ ' |[[:lower:]][[:upper:]]'

-- Schemas
UNION ALL
  SELECT FORMAT (
    'ALTER SCHEMA %I RENAME TO %I;',
    schema_name,
    lower(
      regexp_replace(
        replace(schema_name, ' ', '_'),
        '([[:lower:]])([[:upper:]])',
        '\1_\2',
        'g'
      )
    )
  )
  FROM information_schema.schemata
  WHERE schema_name ~ ' |[[:lower:]][[:upper:]]'; 

From there you can edit the commands to be run, delete the ones you don't want, or simply run \gexec if you're using psql and they will execute.

Testing

If you wish you can test the above

CREATE SCHEMA "myFoo bar";
CREATE TABLE "myFoo bar"."foo bar myBaz" ( "my foo" int, "myBar" int, "MyBaz" text, "myFooBarBaz" int, "MyFoo Bar Baz" int, "myTestQ" uuid );

                                            format                                             
-----------------------------------------------------------------------------------------------
 ALTER TABLE test.my_foo_bar.foo_bar_my_baz RENAME COLUMN "MyFoo Bar Baz" TO my_foo_bar_baz;
 ALTER TABLE test."myFoo bar"."foo bar myBaz" RENAME COLUMN "my foo" TO my_foo;
 ALTER TABLE test."myFoo bar"."foo bar myBaz" RENAME COLUMN "myBar" TO my_bar;
 ALTER TABLE test."myFoo bar"."foo bar myBaz" RENAME COLUMN "MyBaz" TO my_baz;
 ALTER TABLE test."myFoo bar"."foo bar myBaz" RENAME COLUMN "myFooBarBaz" TO my_foo_bar_baz;
 ALTER TABLE test."myFoo bar"."foo bar myBaz" RENAME COLUMN "MyFoo Bar Baz" TO my_foo_bar_baz;
 ALTER TABLE test."myFoo bar"."foo bar myBaz" RENAME COLUMN "myTestQ" TO my_test_q;
 ALTER TABLE test."myFoo bar"."foo bar myBaz" RENAME TO foo_bar_my_baz;
 ALTER SCHEMA "myFoo bar" RENAME TO my_foo_bar;
(9 rows)

Note because there is a collision in the above to my_foo_bar_baz, you'll see

ERROR: column "my_foo_bar_baz" of relation "foo bar myBaz" already exists

That just confirms nothing catastrophic happens if you happen to run into that. highly unlikely