There are a couple of misunderstandings here:
The null bitmap is not part of the heap tuple header. Per documentation:
There is a fixed-size header (occupying 23 bytes on most machines),
followed by an optional null bitmap ...
Your 32 nullable columns are unsuspicious for two reasons:
The null bitmap is added per row, and only if there is at least one actual NULL
value in the row. Nullable columns have no direct impact, only actual NULL
values do. If the null bitmap is allocated, it's always allocated completely (all or nothing). The actual size of the null bitmap is 1 bit per column, rounded up to the next byte. Per current souce code:
#define BITMAPLEN(NATTS) (((int)(NATTS) + 7) / 8)
The null bitmap is allocated after the heap tuple header and followed by an optional OID and then row data. The start of an OID or row data is indicated by t_hoff
in the header. Per comment source code:
Note that t_hoff must be a multiple of MAXALIGN.
There is one free byte after the heap tuple header, which occupies 23 bytes. So the null bitmap for rows up to 8 columns effectively comes at no additional cost. With the 9th column in the table, t_hoff
is advanced another MAXALIGN
(typically 8) bytes to provide for another 64 columns. So the next border would be at 72 columns.
To display control information of a PostgreSQL database cluster (incl. MAXALIGN
), example for a typical installation of Postgres 9.3 on a Debian machine:
sudo /usr/lib/postgresql/9.3/bin/pg_controldata /var/lib/postgresql/9.3/main
I updated instructions in the related answer you quoted.
All that aside, even if your ALTER TABLE
statement triggers a whole table rewrite (which it probably does, changing a data type), 250K are really not that much and would be a matter of seconds on any halfway decent machine (unless rows are unusually big). 10 minutes or more indicate a completely different problem. Your statement is waiting to get a lock on the table, most likely.
The growing number of entries in pg_stat_activity
means more open transactions - indicates concurrent access on the table (most likely) that has to wait for the operation to finish.
A few shots in the dark
Check for possible table bloat, try a gentle VACUUM mytable
or a more aggressive VACUUM FULL mytable
- which might encounter the same concurrency issues, since this form also acquires an exclusive lock. You could try pg_repack instead ...
I would start by inspecting possible issues with indexes, triggers, foreign key or other constraints, especially those involving the column. Especially a corrupted index might be involved? Try REINDEX TABLE mytable;
or DROP
all of them and re-add them after ALTER TABLE
in the same transaction.
Try running the command in the night or whenever there is not much load.
A brute-force method would be to stop access to the server, then try again:
Without being able to pin it down, upgrading to the current version or the upcoming 9.4 in particular might help. There have been several improvements for big tables and for locking details. But if there is something broken in your DB, you should probably figure that out first.
Please note that I am not a MySQL developer, I use MS SQL Server. But the behavior in your post suggests the following:
It does not look like a deadlock to me, especially with the error messages:
InnoDB: Error: semaphore wait has lasted > 600 seconds
InnoDB: We intentionally crash the server, because it appears to be hung.
A deadlock normally would quickly determine which connection to roll back instead of processing up to 600 seconds before crashing the server.
Likely what is happening is that the table metadata change (the ALTER TABLE) cannot happen while there are other transactions using the table.
http://dev.mysql.com/doc/refman/5.6/en/metadata-locking.html
The link says, in part: "To ensure transaction serializability, the server must not permit one session to perform a data definition language (DDL) statement on a table that is used in an uncompleted explicitly or implicitly started transaction in another session."
EDIT: Sorry for the misunderstanding regarding deadlocks.
Per this post, https://stackoverflow.com/questions/24860111/warning-a-long-semaphore-wait
Ensure that: Innodb_adaptive_hash_index=0
Best Answer
Yes, it locks the table. From the docs on MySQL 8,
And from the docs you linked, it's pretty explicit
As you've stated, you're on 10.2. So it looks like adding a column will require rebuilding the whole table.
As to what happens when you can't receive a lock,
Yes, that's generally what happens during a lock, but be aware that's not always what happens. Sometimes statements and transactions give up waiting. Sometimes backends and pools get reaped when they're stuck waiting. It's always safer to do this during downtime, to have timeouts, and to catch errors from libraries when the timeouts expire. So long as you're using transactions, things rollback if something is triggered and can't get it's lock before timeout -- all will be kosher.