Sql-server – Non-sequential Primary Key performance in PostgreSQL 9.3

performancepostgresqlprimary-keysql server

Due to requirement to support distributed environment, each of my clients has been assigned unique node id, it appends the node id with running number as primary key and stores at local database. E.g.

Client A node id = 200, first row in a table will have primary key
200,000

Client B node id = 100, first row in a table will have primary key
100,000

These records then replicate to centralized database. Since primary key at centralized database is not in sequence, will it cause any serious performance issue when data size getting bigger?

Possible sequence of inserting new data at centralized database:

200,000
100,000
100,001
200,001
300,000
100,002

This may cause a big performance in SQL Server with table is clustered along the PK. However, will this happens in PostgreSQL 9.3?

Notes:

  1. I can't use composite keys as it does not play well at my presentation layers.
  2. The 3 digits running number is just a simplified example, real
    running number will be much bigger and sufficient.

Best Answer

I have 3 solutions for you:

Direct reference of sequence and using concat

One possible solution is to reference the seqence in insert statement directly and prepend your node-id. A similar question including answer you can find here: https://stackoverflow.com/a/17925601/4206293

Using a UUID

Another possible solution is, if you don't need you node-id in the primary-key field, you can use the uuid-ossp extension which provides the type uuid and the functions to generate uuids: http://www.postgresql.org/docs/9.4/static/uuid-ossp.html

Use a trigger for these solutions

For both solutions: you can use a trigger to set the primary key.

Example:

-- table
CREATE TABLE test(
   id character varying(10) NOT NULL,
  "name" character varying,
   CONSTRAINT idx_pk PRIMARY KEY (id)
);

-- seqence
CREATE SEQUENCE test_seq
  INCREMENT 1
  MINVALUE 1
  MAXVALUE 9223372036854775807
  START 1
  CACHE 1;

-- Create Function
CREATE OR REPLACE FUNCTION insert_trigger() 
RETURNS TRIGGER    
AS $$
BEGIN
   NEW.id := '100' || nextval('test_seq')::TEXT;
   RETURN NEW;
END;
$$ LANGUAGE 'plpgsql';

-- add Trigger
CREATE TRIGGER insert_table_trigger
BEFORE INSERT ON pkTable
FOR EACH ROW EXECUTE PROCEDURE insert_trigger();

If you now insert a value in the table

INSERT INTO test (name) VALUES ('test text');

you get your primary key '1001'

Using MINVALUE and MAXVALUE

Another way is you use the MINVALUE and MAXVALUE of SEQUENCE to define an numeric space:

CREATE SEQUENCE node100_seq
  INCREMENT 1
  MINVALUE 100000000000
  MAXVALUE 100999999999
  START 100000000000
  CACHE 1;

CREATE SEQUENCE node200_seq
  INCREMENT 1
  MINVALUE 200000000000
  MAXVALUE 200999999999
  START 200000000000
  CACHE 1;