Postgresql – How to make unique check without creating heavy index in postgres

index-tuningmany-to-manypostgresql

I need to implement unique check for INSERT and UPDATE operations, but I would prefer to avoid creating heavy unique index (which is approx 12Gb now) on my table. Now I have unique partial index and it work great, except one thing – it need a lot of SSD space. I don't need SELECT operations using this index, I need just check for the uniqueness of the data.

I read this and this discussion and I understood that it's no real difference between unique constraint and unique index, except partial condition, which is able only for index.

Is there way to do this?

UPDATE:

My example is the schema of many-to-many relation table with history option. This option is implemented by 2 additional fields time_from and time_to. They are storing the time of entering and leaving relation. For data consistency I created 4 additional partial unique indexes (see below).

Now this table contains 1 162 010 000 rows. And overall space of table with each indexes:

vkontakte_wall_post_likes_users - 57 GB
vkontakte_wall_post_like_users_post_id - 24 GB
vkontakte_wall_post_like_users_time_to_2col_uniq - 24 GB
vkontakte_wall_post_like_users_pkey - 24 GB
vkontakte_wall_post_like_users_user_id - 24 GB
vkontakte_wall_post_like_users_time_to_3col_uniq - 846 MB
vkontakte_wall_post_like_users_time_from_2col_uniq - 8192 bytes
vkontakte_wall_post_like_users_id_seq - 8192 bytes

The schema:

CREATE TABLE vkontakte_wall_post_likes_users
(
  id integer NOT NULL DEFAULT nextval('vkontakte_wall_post_like_users_id_seq'::regclass),
  post_id integer NOT NULL,
  user_id integer NOT NULL,
  time_from timestamp with time zone,
  time_to timestamp with time zone,
  CONSTRAINT vkontakte_wall_post_like_users_pkey PRIMARY KEY (id),
  CONSTRAINT post_id_refs_id_3979681bdf0b31a3 FOREIGN KEY (post_id)
      REFERENCES vkontakte_wall_post (id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION DEFERRABLE INITIALLY DEFERRED,
  CONSTRAINT user_id_refs_id_73bdbaad8e08aee5 FOREIGN KEY (user_id)
      REFERENCES vkontakte_users_user (remote_id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION DEFERRABLE INITIALLY DEFERRED
)
WITH (
  OIDS=FALSE
);
ALTER TABLE vkontakte_wall_post_likes_users
  OWNER TO manufacture;

CREATE INDEX vkontakte_wall_post_like_users_post_id
  ON vkontakte_wall_post_likes_users
  USING btree
  (post_id);

CREATE UNIQUE INDEX vkontakte_wall_post_like_users_time_from_2col_uniq
  ON vkontakte_wall_post_likes_users
  USING btree
  (post_id, user_id)
  WHERE time_from IS NULL;

CREATE UNIQUE INDEX vkontakte_wall_post_likes_users_time_from_3col_uniq
  ON vkontakte_wall_post_likes_users 
  USING btree 
  (post_id, user_id, time_from)
  WHERE time_from IS NOT NULL;

CREATE UNIQUE INDEX vkontakte_wall_post_like_users_time_to_2col_uniq
  ON vkontakte_wall_post_likes_users
  USING btree
  (post_id, user_id)
  WHERE time_to IS NULL;

CREATE UNIQUE INDEX vkontakte_wall_post_like_users_time_to_3col_uniq
  ON vkontakte_wall_post_likes_users
  USING btree
  (post_id, user_id, time_to)
  WHERE time_to IS NOT NULL;

CREATE INDEX vkontakte_wall_post_like_users_user_id
  ON vkontakte_wall_post_likes_users
  USING btree
  (user_id);

Best Answer

There's no way to avoid creating a unique index if you want to efficiently ensure uniqueness for the data. The unique index is necessary for PostgreSQL to enforce uniqueness in the face of concurrent inserts, updates, and deletes.

A unique index is what backs a unique constraint; when you create a unique constraint, a unique index is created for you automatically.

The only other way to enforce uniqueness is to LOCK TABLE mytable IN EXCLUSIVE MODE before any change to the table that might affect the column you wish to make unique. That way you don't have to create a unique index, but in exchange you can only have one transaction changing the table at a time. You could do this from a trigger only when it detects a change in the unique column, but then you'd suffer from lock upgrades that would lead to frequent deadlocks.

Really, a unique index is the way to go here.

You can mitigate the impact of the required lock by using CREATE UNIQUE INDEX ... CONCURRENTLY then using the version of ALTER TABLE ... ADD CONSTRAINT ... that specifies an already-created index to use for the UNIQUE constraint.