PostgreSQL 10 – Understanding Identity Columns

identitypostgresqlpostgresql-10

I was reviewing the commit-fest scheduled for 7/01 for PostgreSQL and I saw that Pg is likely going to get "identity columns" sometime soon.

I found some mention in information_schema.columns but nothing much

is_identity         yes_or_no         Applies to a feature not available in PostgreSQL
identity_generation character_data    Applies to a feature not available in PostgreSQL
identity_start      character_data    Applies to a feature not available in PostgreSQL
identity_increment  character_data    Applies to a feature not available in PostgreSQL
identity_maximum    character_data    Applies to a feature not available in PostgreSQL
identity_minimum    character_data    Applies to a feature not available in PostgreSQL
identity_cycle      yes_or_no         Applies to a feature not available in PostgreSQL

The Wikipedia Page doesn't say much either

An identity column differs from a primary key in that its values are managed by the server and usually cannot be modified. In many cases an identity column is used as a primary key; however, this is not always the case.

But, I don't see anything else on them. How do identity columns work? Do they provide any new functionality or is this just a standard method to create sequences? Any breakdown of the new feature and how it works?

Best Answer

This is to implement the feature found in the standard. (copied from a draft, date: 2011-12-21):

4.15.11 Identity columns

The columns of a base table BT can optionally include not more than one identity column. The declared type of an identity column is either an exact numeric type with scale 0 (zero), INTEGER for example, or a distinct type whose source type is an exact numeric type with scale 0 (zero). An identity column has a start value, an increment, a maximum value, a minimum value, and a cycle option. ...
... The definition of an identity column may specify GENERATED ALWAYS or GENERATED BY DEFAULT.

It is a property of a column which basically says that the values for the column will be provided by the DBMS and not by the user and in some specific manner and restrictions (increasing, decreasing, having max/min values, cycling if the max/min value is reached).

Sequence generators (usually called just "sequences") are a related SQL standard feature: it's a mechanism that provides such values - and can be used for identity columns.

Note the subtle difference: a SEQUENCE is an object that can be used to provide values for one or more identity columns or even at will.


The various DBMS have so far implemented similar features in different ways and syntax (MySQL: AUTO_INCREMENT, SQL Server: IDENTITY (seed, increment), PostgreSQL: serial using SEQUENCE, Oracle: using triggers, etc) and only recently added sequence generators (SQL Server in version 2012 and Oracle in 12c).

Up to now Postgres has implemented sequence generators (which can be used to provide values for column, either with the special macros serial and bigserial or with nextval() function) but has not yet implemented the syntax for identity columns, as it is in the standard.

Defining identity columns (and the slight difference from serial columns) and various syntax (eg. GENERATED ALWAYS, NEXT VALUE FOR, etc) from the SQL standard is what this feature is about. Some changes / improvements may need to be done on the implementation of sequences as well, as identity columns will be using sequences.

If you follow the link identitity columns (from the page you saw), you'll find:

identity columns

From: Peter Eisentraut
To: pgsql-hackers Subject: identity columns
Date: 2016-08-31 04:00:42
Message-ID: 6adbacbf-73bc-dd1a-2033-63409180fd18@2ndquadrant.com


Here is another attempt to implement identity columns. This is a standard-conforming variant of PostgreSQL's serial columns. It also fixes a few usability issues that serial columns have:

  • need to set permissions on sequence in addition to table (*)
  • CREATE TABLE / LIKE copies default but refers to same sequence
  • cannot add/drop serialness with ALTER TABLE
  • dropping default does not drop sequence
  • slight weirdnesses because serial is some kind of special macro

(*) Not actually implemented yet, because I wanted to make use of the NEXT VALUE FOR stuff I had previously posted, but I have more work to do there.

...

Update 2017, September: seems like the feature will be in Postgres 10, which is to be released in a few days/weeks: What's New In Postgres 10: Identity Columns


Oracle have also implemented identity columns and sequences, in version 12c. The syntax is according to the standard, as far as I checked:
Identity Columns in Oracle Database 12c Release 1 (12.1)

The 12c database introduces the ability define an identity clause against a table column defined using a numeric type. The syntax is show below.

GENERATED
[ ALWAYS | BY DEFAULT [ ON NULL ] ]
AS IDENTITY [ ( identity_options ) ]