Postgresql – delete hung soon after vacuum verbose analyse

I have a fairly large java webapp system that connects to a postgres8.4 db on rhel.
This app only does inserts and reads.

This has been running without issue for 2.5 years and for the last year or so I implemented a data deletion script which deletes up to 5000 parent records and the corresponding child data in 3 tables.

This deletion script is called from a cron job every 5 minutes. It has been working flawlessly for over a year. This script always takes about 1-3 seconds to complete.

Also called from cron is vacuum verbose analyse script which is just called once a day (a few minutes before a deletion). This also has been working flawlessly for over a year. This takes about 15 minutes to complete.

Now last weekend (Saturday), the vacuum kicked off at 14:03, the deletion kicked off at 14:07 and did not complete. The next deletion kicked off at 14:12 and encountered a deadlock. The webapp at 14:14 hung. At 14:16 everything resumed as per normal and has been running fine since.

Now

To add more confusion, this server has an almost identical setup (standby sever) however the vacumm cron is due to run at 02:03 in the morning. When the vacuum kicked off on Sunday at 02:03, the same situation as above was encountered and at 02:14 the java app hung resuming at 02:15.

More

To further confuse me:
every time I go to the site I takes reading of df – this is always around 40% but after this happened, the df now reports 5% less

Any ideas? Please let me know if I have left out any relevant information.

edit

This is the Postgresql-Sat.log

WARNING:  skipping "pg_authid" --- only superuser can vacuum it
WARNING:  skipping "pg_database" --- only superuser can vacuum it
WARNING:  skipping "pg_tablespace" --- only superuser can vacuum it
WARNING:  skipping "pg_pltemplate" --- only superuser can vacuum it
WARNING:  skipping "pg_shdepend" --- only superuser can vacuum it
WARNING:  skipping "pg_shdescription" --- only superuser can vacuum it
WARNING:  skipping "pg_auth_members" --- only superuser can vacuum it
ERROR:  deadlock detected
DETAIL:  Process 12729 waits for ShareLock on transaction 91311423; blocked by process 12800.
    Process 12800 waits for ShareLock on transaction 91311422; blocked by process 12729.
    Process 12729: delete from child1 where id in
    (
    select id from parent where date_collected < now() - interval '13 months' order by id limit 5000
    );
    Process 12800: delete from child1 where id in
    (
    select id from parent where date_collected < now() - interval '13 months' order by id limit 5000
    );

SELECT kl.pid as blocking_pid, ka.usename as blocking_user, ka.current_query as blocking_query, bl.pid as blocked_pid, a.usename as blocked_user, a.current_query as blocked_query, to_char(age(now(), a.query_start),'HH24h:MIm:SSs') as age FROM pg_catalog.pg_locks bl JOIN pg_catalog.pg_stat_activity a ON bl.pid = a.procpid JOIN pg_catalog.pg_locks kl ON bl.locktype = kl.locktype and bl.database is not distinct from kl.database and bl.relation is not distinct from kl.relation and bl.page is not distinct from kl.page and bl.tuple is not distinct from kl.tuple and bl.virtualxid is not distinct from kl.virtualxid and bl.transactionid is not distinct from kl.transactionid and bl.classid is not distinct from kl.classid and bl.objid is not distinct from kl.objid and bl.objsubid is not distinct from kl.objsubid and bl.pid kl.pid JOIN pg_catalog.pg_stat_activity ka ON kl.pid = ka.procpid WHERE kl.granted and not bl.granted ORDER BY a.query_start;

WARNING: skipping "pg_authid" --- only superuser can vacuum it WARNING: skipping "pg_database" --- only superuser can vacuum it WARNING: skipping "pg_tablespace" --- only superuser can vacuum it WARNING: skipping "pg_pltemplate" --- only superuser can vacuum it WARNING: skipping "pg_shdepend" --- only superuser can vacuum it WARNING: skipping "pg_shdescription" --- only superuser can vacuum it WARNING: skipping "pg_auth_members" --- only superuser can vacuum it

Best Answer

I'm not able to test this since I don't use postgresql 8.4 but i think you have to:

check which transaction(check Sql statement) is locked and which is locking.
to check it

Note:this is for postgresql v9.1 i am not sure this work for 8.4 or not.

You mentioned about dead lock so it mean your transaction are waiting each other until occur deadlock(no transaction completed(commit or rollback) so make sure the locking transaction was completed(process 12729) and then transaction (Process 12800) will not wait anymore.
Vacuum is occur table lock level too. when u run vacuum command it will locked the target tables to reclaims storage occupied by dead tuples so that why when you check the df - your free disk space is higher than before.
about Privileges

you need to use supper user(postgres) to make vacuum verbose analyse for above tables (pg_*) or you just specified the target table only, it no need to use supper user privileges.

ex:

vacuum verbose analyse [table_name];

Best Answer

to check it

Related Solutions

Postgresql – Disk file effects of delete and vacuum

Postgresql – pg_relation_size does not show any difference after VACUUM ANALYZE

Related Question