Postgresql – How to implement a sequence for each foreign key value

postgresqlpostgresql-9.3sequence

In a PostgreSQL (9.3) database, I have a one-to-many relationship between Accounts and Customers, so within the Customers table, I have a globally-unique id for the primary key of each Customer, plus the corresponding Account id. But I would also like to assign each one an Account-specific id – just a number starting at 1 for each Account's first Customer and incrementing for each additional Customer – simply as an easy way for users in an Account to reference a Customer in the UI.

I can think of two possible solutions:

Use the max() aggregate function to obtain the last Customer id used for the Account and increment it
- Pros
  - Relatively simple (conceptually, at least)
- Cons
  - Can result in an id being reused if the newest Customer is deleted
  - Need to handle concurrent inserts calculating the same id for the new records
Use a sequence
- Pros
  - Specifically designed to handle incrementing the value
  - Remembers the last-used value, even if the corresponding record is deleted
- Cons
  - Requires a separate sequence to be created for each new Account somehow (and then be deleted if/when the Account is deleted), otherwise each Account would end up with non-contiguous values in its Customer id sequence

UPDATE

To clarify the last point, some gaps in the sequence of ids are always going to occur; as soon as a Customer is deleted, for example, but those are to be expected. Aside from that though, a user adding a couple of new Customers would generally expect them to have consecutive ids even if they were added days apart.

OTOH, using a single sequence for all Customers could almost certainly see regular, unexplained gaps introduced by the ids assigned to the Customers created for other Accounts in the meantime.

I'm not a Postgres expert, so I'm concerned about how robustly I could implement either of these, especially as I've struggled to find any examples to follow.

Is there a generally accepted way of doing this?

Best Answer

How about using a column in the table account called max_customer_id and updating that every time there's a new customer? That should be the simplest solution available that I can think of.

With this solution you have to take care of the concurrency. Take a look at the document PostgreSQL Concurrency issues:

If two concurrent transactions try to insert rows having the same key value, then the second one will block until the first one finishes. If the first transaction commits, the second one must abort because of the uniqueness constraint; but if the first one aborts the second one can proceed.

If you want to assign a serial number to each row in a table, you might try
INSERT INTO mytable (id, ...) VALUES( (SELECT MAX(id) + 1 FROM mytable), ...);
This will not work safely unless you take an explicit lock on the whole table, which will prevent concurrent insertions. (It’ll also be quite slow, because MAX scans the whole table in PostgreSQL.)

Alternatively you could place the value in a separate table. This table would contain the account_id and maximum account_specific_id in use for that account. That way you could handle the concurrency easier with UPDATEs. Same source:

A variant is to use a single-row table to hold the next ID number to assign:
SELECT next FROM mytable counter FOR UPDATE; UPDATE mytable counter SET next = $next + 1; INSERT INTO mytable (id, ...) VALUES($next, ...);
This works (as long as you use FOR UPDATE), but you still have the problem that only one insertion transaction can proceed at a time. The implicit write lock on the counter row is the bottleneck.
[The author goes on recommending sequences.]

The number 2 option with sequences is not however recommended in this case as new sequences are not meant to be routinely inserted in a production environment.

Related Solutions

Polling for new records in a table

Some of the implementation specifics depend on exactly what type of behaviour you need from this table (e.g. FIFO), but a typical design for a queue table uses DELETE with an OUTPUT clause:

DELETE TOP (n)
FROM QueueTable WITH (ROWLOCK, READPAST)
OUTPUT deleted.RecordID;

Or if ordered removal is required:

DELETE FROM
(
    SELECT TOP (n) RecordID
    FROM QueueTable WITH (ROWLOCK, READPAST)
    ORDER BY RecordID
) AS DF
OUTPUT deleted.RecordID;

If the records should remain in the table after processing, the table would have an extra columns to record the status (e.g. New, Processing, Processed) and probably the process id owning the row (if Processing or Processed). The de-queuing code is then modified to choose only records with the desired status. An alternative design is to DELETE rows and move them to a separate audit/archive table at the same time. As I mentioned, the specific solution depends on the precise requirements.

Postgresql – Nearest value for each foreign key

You can use window functions to achieve your goal. lag() and lead() are ones which can help you in a query like

SELECT lag(di_timestamp) OVER ordering, lead(di_timestamp) OVER ordering
  FROM data_item
 WHERE fk_fc_id IN (35246,35247)
WINDOW ordering AS (ORDER BY di_timestamp 
                    RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW);

This will return the previous and the next timestamps, if there is any.

Best Answer

Related Solutions

Polling for new records in a table

Postgresql – Nearest value for each foreign key

Related Question