PostgreSQL – make Postgres more verbosely tell me what it's doing

alter-table, locking, postgresql, postgresql-9.3

I'm trying to use ALTER TABLE on a live db (not in production, just poking around at a dev box and trying to learn some things).

Here is the command I'm trying:

psql -h $IP -p 5432 -U $DBUSER -d $DBNAME \
    -c "alter table request alter column sources set data type text;"

And what happens is… nothing. Postgres just sits there silently for many, many minutes, until I give up and cancel it:

^CCancel request sent
ERROR:  canceling statement due to user request

Some reading I've done suggested that this should be fast because it's not adding any onerous constraints like UNIQUE or foreign keys or anything.

There are fewer than 100 rows in the table, so even if this were a heavy command to run it should still finish relatively quickly. The existing type is varchar(1000); is it just difficult and slow for Postgres to convert from varchar to text? I would think that would be a trivial conversion, essentially just dropping the 1000-character limit.

Is it possible that the table is somehow locked and my command is just paused, waiting for something else to finish? How would I check that? If so, how would I say "no pg, this one is the important one, lock out everything else"?

Generally it'd just be nice if I could get pg to print out some debugging to say whether or not the long pause is due to heavy processing or just waiting around for a lock to free.

There are 29 rows in pg_locks and some of them do say ExclusiveLock but beyond that I'm not really sure how to interpret this information.
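For reference, this is roughly how I've been looking at the locks (just my guess at a useful query, with column names taken straight from the docs, so I may well be missing the interesting bits):

select locktype, relation::regclass, mode, granted, pid
from pg_locks
order by relation, granted;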

PostgreSQL version is 9.3.9-0ubuntu0.14.04 (that is, the latest version available for Ubuntu Trusty).

Here's the block report: http://paste.ubuntu.com/12243226/

Potentially relevant: my db consumer is Flask/SQLAlchemy, so perhaps it is the one issuing the SELECT statements that are blocking the table. The fact that 9 statements are blocking and they all say "where request_id = 1" is quite strange, because request_id 1 was deleted some time ago. Is this some kind of dangling ghost query? How can I clear this out?

Here's pg_stat_activity: http://paste.ubuntu.com/12243295/

Best Answer

The idle in transaction state shown in the pg_stat_activity output generally means that something starts a transaction with BEGIN, does some work, and then never issues the COMMIT that would finish the transaction. This has the effect of leaving the ACCESS SHARE locks acquired by the SELECT statements hanging around, and those block the ALTER TABLE statement you're trying to run from acquiring the ACCESS EXCLUSIVE lock it needs on the table in order to change the column's type.
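If you want to see this for yourself, a query along these lines (just a sketch, assuming the table is still called request) shows which sessions already hold locks on the table and which ones are waiting:

SELECT a.pid, l.mode, l.granted, a.state, a.xact_start, a.query
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.relation = 'request'::regclass
ORDER BY l.granted DESC, a.xact_start;

The row with granted = false should be your ALTER TABLE; the granted = true rows sitting in the idle in transaction state are the ones holding it up.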

They could only be considered "ghost queries" in the sense that they are hanging around waiting for a COMMIT or ROLLBACK that will probably never come. But they are problematic nonetheless, as you figured out.

There might be a bug somewhere in your application that has left all these connections open and waiting for a commit. I'd definitely check that out, and see whether some code path is missing its terminating COMMIT.

More commonly (in my experience, YMMV of course), people start database sessions with a BEGIN, do some work, and then go to lunch and never close their session.
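In miniature, that lunch-break scenario is just something like this in a psql session (using the request table from the question as a stand-in):

BEGIN;
SELECT * FROM request WHERE request_id = 1;
-- ...and then the session just sits here, idle in transaction, keeping its
-- ACCESS SHARE lock on request until someone finally runs COMMIT or ROLLBACK.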

It's best to fix the problem at the source, but if you absolutely, positively need to do something right now, pg_cancel_backend and pg_terminate_backend are not bad ways to go.
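For example (the 12345 here is a made-up pid; you'd take the real one from the pid column of pg_stat_activity):

-- Ask the backend to cancel its current query (sends SIGINT):
SELECT pg_cancel_backend(12345);

-- Or terminate the whole backend and its connection (sends SIGTERM):
SELECT pg_terminate_backend(12345);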

Commenting on your answer, I'd specifically select on the state of idle in transaction, rather than LIKE '%idle%', because those are the ones causing your problems. The idle state backends are just waiting around patiently for new client commands, and had last run a ROLLBACK command.
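In other words, something like this sketch, which lists only the idle-in-transaction backends and how long they've been sitting there:

SELECT pid, usename, xact_start, state_change, query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY xact_start;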

You can look at the Explicit Locking documentation for more information about what blocks what.

The description of idle in transaction is in the Standard Unix Tools section of the monitoring documentation, for reference.

pg_cancel_backend and pg_terminate_backend issue SIGINT and SIGTERM signals, respectively, and the effects of those signals (and warnings about other signals) are documented in the Server Shutdown section.

Hope that helps explain what you saw and sheds some light on your mystery. =)