PostgreSQL: Corrupt primary key, inconsistent table

postgresql primary-key

While recovering from a cloud failure, I found that some tables in a PostgreSQL database are behaving strangely. These tables are indexed by a primary key, but the pg_dump output contained duplicate key values, which made pg_restore fail on a backup server.

I have tried to REINDEX:

REINDEX INDEX rank_details_pkey;
ERROR:  could not create unique index "rank_details_pkey"
DETAIL:  Table contains duplicated values.

The index is defined as:

<table info here>
Indexes:
    "rank_details_pkey" PRIMARY KEY, btree (user_id)

And, oddly,

SELECT user_id, COUNT(*) FROM <table name> GROUP BY 1 HAVING COUNT(*) > 1;
 user_id | count 
---------+-------
(0 rows)

To conclude: I have duplicate values in my table that cannot be found or cleared.

Any ideas how to fix this? This is a production server, so all fixes should be done without affecting service.

Best Answer

There are various ways this can happen in Oracle. I'm not sure about PostgreSQL, but I would call this an "integrity violation" rather than "corruption".

Perhaps you can do one of the things suggested here, i.e. set enable_indexscan = off, or:

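begin;
-- the index backs the primary key constraint, so drop the constraint (a plain DROP INDEX is rejected)
alter table rank_details drop constraint rank_details_pkey;
select user_id, count(*) from rank_details group by user_id having count(*) > 1;
-- undo the drop; nothing is changed permanently
rollback;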

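But "there are likely some locking issues with this, so be careful with it in production": the drop takes an ACCESS EXCLUSIVE lock on the table, which blocks all other access to it until the rollback.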
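
The other suggestion, disabling index scans, avoids that lock. A minimal sketch, assuming the table is rank_details as in the query above: enable_indexonlyscan (PostgreSQL 9.2 and later) and enable_bitmapscan are planner settings not named in the original suggestion, and are switched off here as well so that no index-based plan is preferred.

```sql
BEGIN;

-- SET LOCAL limits the change to this transaction only.
SET LOCAL enable_indexscan = off;
SET LOCAL enable_indexonlyscan = off;  -- assumption: PostgreSQL 9.2 or later
SET LOCAL enable_bitmapscan = off;

-- With index-based plans discouraged, the planner should read the table itself.
SELECT user_id, count(*)
FROM rank_details
GROUP BY user_id
HAVING count(*) > 1;

ROLLBACK;
```

Running the SELECT under EXPLAIN first should confirm that the plan shows a sequential scan on the table rather than a scan of rank_details_pkey.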

The idea is to force the query to scan the table rather than just the index (which does not have the duplicates). You may also, and more simply, be able to achieve the same by:

select user_id, f(<some other column>), count(*)
from rank_details
group by user_id, f(<some other column>)
having count(*) > 1

where f() returns a constant, which may trick the planner into a table scan.

Related Solutions

Update primary key

If your FOREIGN KEY constraints are defined with the ON UPDATE CASCADE option, then you need to do nothing more than update the parent table.

But it seems that the ON UPDATE options are not yet properly implemented in Django.
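
As an illustration of the ON UPDATE CASCADE behaviour at the database level, here is a small sketch in plain SQL. The tables parent and child and their columns are hypothetical names invented for this example, not taken from the question.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE parent (
    id integer PRIMARY KEY
);

CREATE TABLE child (
    child_id  integer PRIMARY KEY,
    parent_id integer NOT NULL
        REFERENCES parent (id) ON UPDATE CASCADE
);

INSERT INTO parent VALUES (1);
INSERT INTO child VALUES (100, 1);

-- Change the parent key; the referencing row in child follows automatically.
UPDATE parent SET id = 2 WHERE id = 1;

-- child.parent_id now holds the new key value.
SELECT child_id, parent_id FROM child;
```

Without ON UPDATE CASCADE, the default action (NO ACTION) would reject the UPDATE while a child row still references the old key.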

PostgreSQL – Primary Key Disappears from Test Table

For a table created like this:

CREATE TABLE public.delete_key_bigserial (id bigserial PRIMARY KEY NOT NULL);

... both my queries in the previous answer (as well as pgAdmin, psql or any other decent client) would find the PK constraint. If it's not there, you removed it somehow.
Note that my first query only returns the column if it is the PK and a serial type - which is the case for the example.

Another possible cause for the confusion: Maybe you have more than one table named delete_key_bigserial in your database? Table names are only unique inside a single schema. Test with:

SELECT * FROM pg_class WHERE relname = 'delete_key_bigserial';

To make your query unambiguous, schema-qualify the table name:

WHERE  a.attrelid = 'public.delete_key_bigserial'::regclass
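
To see directly which schemas contain a table of that name, the pg_class lookup can be joined to pg_namespace. This is a sketch rather than part of the original queries; the column aliases are arbitrary.

```sql
-- List every ordinary table named delete_key_bigserial, with its schema.
SELECT n.nspname AS schema_name,
       c.relname AS table_name
FROM   pg_class c
JOIN   pg_namespace n ON n.oid = c.relnamespace
WHERE  c.relname = 'delete_key_bigserial'
AND    c.relkind = 'r';   -- 'r' = ordinary table
```

More than one row in the result means that an unqualified table name resolves according to the current search_path.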

There are ways to make the constraint "disappear" without leaving a DROP CONSTRAINT in your logs.

Drop and recreate the table.
Drop and recreate the schema or database.
(Temporarily) set log_statement or other relevant settings so the statement is not logged.
Manipulate the system catalogs directly (as superuser). Internally, the primary key is recorded with contype = 'p' in the system catalog pg_constraint.
Edit the log files.
etc.
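
Whether the primary key is still recorded can be checked against pg_constraint directly. A minimal sketch, assuming the table lives in the public schema as in the CREATE TABLE statement above:

```sql
-- Returns one row if a primary key constraint exists on the table, none otherwise.
SELECT conname,
       pg_get_constraintdef(oid) AS definition
FROM   pg_constraint
WHERE  conrelid = 'public.delete_key_bigserial'::regclass
AND    contype = 'p';   -- 'p' = primary key
```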
