PostgreSQL – Estimating End Time for a Long Update

postgresql

I have currently running a long update (~million rows based on a postgis st_within condition) that was expected to finish in an hour or so based on similar previous queries. However, it's been three hours already; system monitoring shows that it's pegged a single CPU at 100% for that time.

Is there any way to view the progress of this update, to estimate when it is expected to end?

Best Answer

I know this question is old but I monitore all my updates in postgresql including updates inside transactions.. So I decided to help!

1- First create a sequence named myprogress for example.

create sequence myprogress; 

2- Make your update to hit the sequence - now you are free to start a transaction:

For example if you have this update:

update mytable set c=3 where a=0 and b=1; 

Just rewrite it to (plus this and condition):

update mytable set c=3 where ( a=0 and b=1 ) and  NEXTVAL('myprogress')!=0;

**You are ok to run this update inside a transaction, as sequences are affected globally. **

Note that this will not significantly affect performance.

3- Monitore your progress. Now you can just connect in another session (or a transaction) and select your sequence (sequences are affected globally , so you will see the value in the another sessions):

select last_value from myprogress; 

And you will see how many lines are affected by your query until now, with that you can estimate how many lines by second are hit by your update.

And in most of the cases how many time you will need to wait...

4- At end just restart the sequence:

alter sequence myprogress restart with 1; 

So you can use it again - But carefully, to do not trust this if two users use the same sequence. If you are in doubt better if you have your own progress sequence with permissions only for you.

You can use it for slow SELECTs, DELETEs, and for some INSERTs too !