Postgresql – find if the column of the PostgreSQL table is empty with minimum cost

postgresql

I need to find if there are any values written to the column but with minimum cost.

There is a table with ~1B rows and about 100 columns with data. I need to find columns that have no data at all (all nulls) and delete them. If I query like "SELECT * FROM my_table WHERE my_column IS NOT NULL LIMIT 1" then it costs me 1-2 minutes and I'm looking for the faster solution.

As far as I know (but not sure), if there are no data at all for the column, there is some property in the database saying that the chunk or the table has no data for this column, at least this happens when adding a new column to the existing table, so the database doesn't update all existing rows if the default value of a new column is NULL. I wonder if there is a fast way to get a result based on this info.

P.S. I use timescaleDB extension and the PK is a timestamp, if this changes anything

Best Answer

As far as I know (but not sure), if there are no data at all for the column, there is some property in the database saying that the chunk or the table has no data for this column, at least this happens when adding a new column to the existing table, so the database doesn't update all existing rows if the default value of a new column is NULL

If the row data ends early, it assumes all column after that for that particular row are NULL, or in newer versions, are whatever DEFAULT was at the time the column was added. But this says nothing at all about what might be in other rows.

You can't avoid scanning the table unless you a had a very strange constraint which forces the values to all be NULL. But you should be able to scan the table just once, not once for every column.

select count(col1), count(co2), count(col3)...count(col100) from the_table;

Related Solutions

PostgreSQL pl/perl trigger, differentiate null vs empty

You did not specify PostgreSQL version. On 9.0.5 I have observed the same behavior (not sure if it's a bug, see comments below).

This is easy to workaround - you can first test for definedness to catch NULL, and if it passed, test for empty string.

if ( not defined $_TD->{ new }{ $column } ) {
    elog( NOTICE, "New value in column $column is not defined" );
}
elsif ( $_TD->{ new }{ $column } eq '' ) {
    elog( NOTICE, "New value in column $column is empty" );
}
if ( not defined $_TD->{ old }{ $column } ) {
    elog( NOTICE, "Old value in column $column is not defined" );
}
elsif ( $_TD->{ old }{ $column } eq '' ) {
    elog( NOTICE, "Old value in column $column is empty" );
}

Postgresql – How to change the mount point for a column in postgresql table

Maybe I'm missing something, but it sounds as if that is a simple case of replacing the value:

update the_table
   set link = replace(link, '/ABC/ABC_DATA/Test/', '/CBF/CBF_DATA/Documents/Test_DATA/');

Best Answer

Related Solutions

PostgreSQL pl/perl trigger, differentiate null vs empty

Postgresql – How to change the mount point for a column in postgresql table

Related Question