PostgreSQL 9.3 – Deleting Many Rows in Chunks Without Long Locks

deleteplpgsqlpostgresql

I am new to postgres,

I have a 20 million row table that is on a live server – I need to remove most of the rows but not all. I want to do this without impact to other read/write processes accessing this table (very frequently).

I have a way to delete in about 100-400K row chunks at a time. Between each delete, I want to make the query sleep – so that other operations can get a chance to access this table.

I have the code, but I believe in this version, it locks the table the entire time the query runs (with all the sleeps). How can I actually release the table while the process sleeps? Thank you!!

My code so far:

CREATE SEQUENCE tmp_sq increment by 1 minvalue 1 maxvalue 53 start with 1;

DO $$

DECLARE 
w_counter integer;

BEGIN
w_counter := (SELECT nextval('tmp_sq')); 

  while w_counter < 53 loop
    raise notice 'Value: %', w_counter ;

    w_counter := (SELECT nextval('tmp_sq')); 

-- this way of breaking up the delete into chunks works for my table because of dates.
    delete from table_a where date_part('week',my_date) = w_counter;

    raise notice '     Rows Were Deleted ';

    --sleep
    raise notice 'Sleeping Now.....';
    perform pg_sleep(60);
  end loop;

END $$;

Best Answer

The table itself is only locked in ROW EXCLUSIVE mode, which shouldn't prevent any normal operations on the table, only things like DROP, ALTER, and CREATE INDEX.

Each individual row that is being deleted will be locked for the duration. This should only block other processes if those other processes are trying to update the rows (or delete them themselves). Normally with bulk deletions you are deleting rows that no one else cares about, which means no one else will be trying to update them.

If you really need to unlock all the deleted rows and the table periodically, then you need to do each loop of the deletions in a separate transaction. There is no easy way to do this in a DO block because the entire block runs in a single transaction. So you should put your loop outside of PostgreSQL, like in bash, perl, python, etc.