Postgresql – Multiple inserts across tables with serial primary keys

postgresql

I asked this question on stack overflow, but I'm thinking it's better on this stack. Apologies if I've got that wrong. The motivation for this question is to prepare some sql scripts I can run during unit tests and production to define a bunch of master data for a web-based application that I'm building.

I have three tables, which I've abbreviated into the following

create table top (top_id serial, top_name text);
create table mid (mid_id serial, top_id integer, mid_name text);
create table bot (bot_id serial, mid_id integer, bot_name text);

what I want to achieve is

  • insert a value into top, and capture it's ID
  • insert multiple values into mid, using the same top_id,
  • for each value in mid, insert multiple values into bot, using each mid_id.

So, the final result might look like

top
----
1, first_top
2, second_top

mid
---
1, 1, first_mid_for_first_top
2, 1, second_mid_for_first_top
3, 2, first_mid_for_second_top

bot
---
1, 1, first_bot_for_first_mid
2, 1, second_bot_for_first_mid
3, 3, first_bot_for_third_mid

My first approach was to try to have a bunch of nested CTEs, but I couldn't make that into valid syntax (and on consideration, I think that it's not the right approach).

My second approach was to try to store the values into some sort of variable, but I don't know how to do that outside of the pg command line tool.

What I'm after is something analogous to

top_id_one = insert into top(null, 'first_top') returning top_id
mid_id_one = insert into mid(null, top_id_one, 'first_mid_for_first_top) returning mid_id
insert into bot(null, mid_id_one, 'first_bot_for_first_mid')

in a "pure sql" that I could paste into PG admin for example.

I realise that there's a few problems in the example above (for example using 'top' as a table name). My explorations have avoided these issues, but I was trying to find table names that would make my example make sense.

Best Answer

The following is incorrect:

insert into top values (null, 'first_top')

You are explicitly passing null for the first column. Postgres will (try to) store that value in the column and the sequence (that the serial uses) will not be used! By supplying an explicit value for a serial column (or any column with a default) you are overwriting that automatic default.


Now for the actual question:

You can use lastval() to obtain the most recently generated sequence value:

insert into top (top_name) values ('first_top');
insert into mid (top_id, mid_name) values (lastval(), 'first_mid_for_first_top');
insert into bot (mid_id, bot_name) values (lastval(), 'first_bot_for_first_mid');

You can't however insert multiple rows into the mid table in a single statement with that.