Postgresql – Best practices for generating unique multi-column keys for weak entities

auto-incrementdatabase-designpostgresqltrigger

How should one generate non-unique, non-natural identifiers for weak entities?

For example, if order_id is the primary key for an order table, and (order_id, item_number) is the primary key for an order_item table with a foreign key on order_id, how best to generate item_number?

A few of possibilities come to mind, but none seem ideal:

  1. Autoincrement item_number: the order_item entity is no longer weak, and the composite key is redundant.

  2. Use a trigger to search for the current max item_number for a given order_id, then increment: if a row is deleted this could lead to reassigning a PK to a different record – that doesn't seem like a good idea? (edit: this can also be done without using triggers as explained in joanolo's answer)

  3. Use a trigger to create a new sequence for every order_id, and somehow draw item_numbers from the appropriate sequence: this is functionally the desired behaviour, but seems like it would be a mess to implement. Is it even possible to reference a sequence by an order_id?

Edit – closely related (if not duplicate):

Best Answer

1. is the least error-prone, simplest and fastest.
Trigger solutions like in 2. or 3. are subject to subtle race conditions under concurrent write access.

Make item_number a serial column and also the PK for order_item in this case. Stick with the default values drawn from the underlying sequence and never update the column.

Create a multi-column index on (order_id, item_number) for performance of typical queries. (Might as well be UNIQUE, but does not have to be.) In a typical setup (order_id and item_number can both be plain integer), the multicolumn index happens to be just as small and fast as an index on just order_id:

(Like I commented:) Typically, the only important role of an item number is to be unique (and immutable). If you need a stable sort order among items, you might just rely on the serial value of item_number. Be aware that those numbers are not necessarily in order of transaction commits. It may be useful to add the transaction timestamp current_timestamp (or possibly statement_timestamp() or clock_timestamp()) to the row. Depends on requirements and access patterns.

You can add a VIEW for the human eye, with item-numbers per order_id starting from 1, dynamically generated with row_number(), ordered by above criteria. But operate with the unique, immutable item_number internally.