You did not specify PostgreSQL version. On 9.0.5 I have observed the same behavior (not sure if it's a bug, see comments below).
This is easy to work around: first test for definedness to catch NULL and, if that passes, test for the empty string.
# In a PL/Perl trigger function, $_TD->{ new } and $_TD->{ old }
# are hash references holding the new and old row values.
if ( not defined $_TD->{ new }{ $column } ) {
    elog( NOTICE, "New value in column $column is not defined" );
}
elsif ( $_TD->{ new }{ $column } eq '' ) {
    elog( NOTICE, "New value in column $column is empty" );
}

if ( not defined $_TD->{ old }{ $column } ) {
    elog( NOTICE, "Old value in column $column is not defined" );
}
elsif ( $_TD->{ old }{ $column } eq '' ) {
    elog( NOTICE, "Old value in column $column is empty" );
}
Since the columns e, k, and n can be NULL, I assume "100% empty" means NULL.
NULL storage is cheap. Each NULL "costs" one bit in the null bitmap for storage and otherwise hardly affects performance. Effective storage requirements depend on whether the null bitmap for each row already exists and has room for 3 more bits.
In the current implementation, tables with up to 8 user columns utilize a spare byte after the tuple header for the null bitmap. For any more, another MAXALIGN (typically 8) bytes are allocated, providing for another 64 columns (72 in total). And so on. So the null bitmap effectively costs 8 bytes per row for most tables, including yours, which has 18 columns.
The null bitmap is allocated completely or not at all per row, and only if there is at least one actual NULL value. If all columns are defined NOT NULL, there can never be one.
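As a back-of-the-envelope sketch (this is not PostgreSQL source code, just a model of the rules described above, assuming MAXALIGN = 8), the per-row cost of the null bitmap can be computed like this:

```python
def null_bitmap_bytes(user_columns: int, maxalign: int = 8) -> int:
    """Extra bytes per row for the null bitmap: a spare byte after
    the tuple header covers up to 8 columns; beyond that, MAXALIGN-byte
    chunks are allocated, each covering maxalign * 8 more columns."""
    if user_columns <= 8:
        return 0  # fits in the spare byte, no extra allocation
    extra_bits = user_columns - 8
    bits_per_chunk = maxalign * 8      # 64 more columns per 8-byte chunk
    chunks = -(-extra_bits // bits_per_chunk)  # ceiling division
    return chunks * maxalign

# An 18-column table like yours pays 8 bytes per row
# that contains at least one NULL value:
print(null_bitmap_bytes(8))   # 0
print(null_bitmap_bytes(18))  # 8
print(null_bitmap_bytes(72))  # 8
print(null_bitmap_bytes(73))  # 16
```

Rows with no NULL at all skip the bitmap entirely, which is why the cost only applies where at least one column is NULL.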
In the absence of other, more important considerations, move columns that are always (or mostly) NULL to the end of the row. Helps performance a tiny bit.
Answer to question 1.
The effective storage requirement is:
- 0 bytes per row where at least one other column is also NULL (the null bitmap is allocated anyway).
- 8 bytes (typically) per row where no other column is NULL, to allocate the null bitmap.
Answer to question 2.
Dropping the columns would not buy you much. The footprint in the page cache would be the same or slightly smaller, depending on answer 1. Handling of queries, backups, source code and such would be simplified.
Further reading:
What you did not ask for
After a round of "column tetris" I suggest this table layout - again barring other, more important considerations:
id serial NOT NULL
, a integer
, g timestamp
, i timestamp
, b integer
, c integer
, h integer
, j integer
, l integer
, d smallint
, f smallint
, m smallint
, o varchar(36)
, p varchar(100)
, q varchar(100)
, k varchar(32) -- always NULL
, n smallint -- always NULL
, e timestamp -- always NULL
Saves at least 16 bytes per row compared to your original layout, probably 20+ depending on your varchar columns. Instructions for "column tetris" here:
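The savings come largely from alignment padding: each fixed-width type must start at an offset that is a multiple of its alignment requirement (8 for timestamp, 4 for integer, 2 for smallint), so ordering columns by descending alignment avoids wasted pad bytes. A simplified sketch of that arithmetic (it ignores the tuple header and variable-width columns; the sizes and alignments are the usual ones, not read from your database):

```python
def padded_row_size(columns):
    """columns: list of (size, align) pairs for fixed-width columns.
    Returns total bytes including alignment padding between columns."""
    offset = 0
    for size, align in columns:
        offset += -offset % align  # pad up to the next aligned offset
        offset += size
    return offset

TIMESTAMP = (8, 8)
INTEGER = (4, 4)
SMALLINT = (2, 2)

# Interleaving small and large columns wastes pad bytes ...
interleaved = [SMALLINT, TIMESTAMP, INTEGER, TIMESTAMP, SMALLINT, INTEGER]
# ... while descending alignment packs the same columns tightly:
sorted_desc = [TIMESTAMP, TIMESTAMP, INTEGER, INTEGER, SMALLINT, SMALLINT]

print(padded_row_size(interleaved))  # 40
print(padded_row_size(sorted_desc))  # 28
```

The same six columns cost 12 bytes less per row purely from ordering, which is the whole point of "column tetris".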
How to reorder columns
The simplest method is to create a new table and copy data into it. Then you get a pristine new table without dead columns (and dead rows). You could sort (cluster) the physical order of rows in the process, too.
Changes in place are possible too, but there is no convenient tool (that I would know of) to reorder the sequence of columns in general. Dependent objects like views, functions, FK constraints, indexes etc. are the limiting factor here: they can block you from deleting a column, so you have to drop and recreate those dependent objects.
And of course, concurrent access to the table conflicts with DDL commands changing it. You need exclusive table locks in this case.
It's simple and cheap to delete completely empty columns and add them at the end of the row. Switching the sequence of populated columns in the middle of the table is not as simple or cheap. I would create a new table for that.
This related answer has recipes and links to more:
In addition, you can get an approximate answer by querying the statistics view pg_stats: a null_frac of zero for a column indicates that no NULL values were found, but since the statistics are based on a sample, this should be double-checked against the 'real' data.