MySQL – Select Column Names with Non-Null Entries

database-designMySQLnullselect

I would like to have a list of those columns of a table that have at least one not-NULL data entries in them.

In other words, I would like to get the column names for which the following returns at least one entry:

SELECT DISTINCT column_name FROM table WHERE column_name IS NOT NULL

I tried the following:

SELECT column_name
FROM information_schema.columns
WHERE table_name = "table_name"
AND EXISTS (
    SELECT DISTINCT column_name FROM table_name WHERE column_name IS NOT NULL
)

But this also returns the column names where all the entries are NULL.

So how do I get only those columns with non-NULL entries?

Best Answer

Let's pick a sample table on my machine:

mysql> show create table weisci_jaws_staging2.users\G
*************************** 1. row ***************************
       Table: users
Create Table: CREATE TABLE `users` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `username` varchar(32) NOT NULL DEFAULT '',
  `passwd` varchar(32) NOT NULL DEFAULT '',
  `user_type` tinyint(4) DEFAULT '2',
  `recovery_key` varchar(48) DEFAULT NULL,
  `name` varchar(255) DEFAULT NULL,
  `email` varchar(255) DEFAULT NULL,
  `url` varchar(255) DEFAULT NULL,
  `timezone` varchar(5) DEFAULT NULL,
  `language` varchar(5) DEFAULT NULL,
  `theme` varchar(24) DEFAULT NULL,
  `editor` varchar(24) DEFAULT NULL,
  `last_login` datetime DEFAULT NULL,
  `createtime` datetime DEFAULT NULL,
  `updatetime` datetime DEFAULT NULL,
  `change_passwd` tinyint(1) NOT NULL DEFAULT '1',
  `never_expire` tinyint(1) NOT NULL DEFAULT '1',
  `bad_passwd_count` smallint(6) DEFAULT '0',
  `last_access` bigint(20) DEFAULT '0',
  `enabled` tinyint(1) NOT NULL DEFAULT '1',
  PRIMARY KEY (`id`),
  UNIQUE KEY `users_username_idx` (`username`)
) ENGINE=MyISAM AUTO_INCREMENT=160 DEFAULT CHARSET=utf8
1 row in set (0.02 sec)

mysql> select count(1) from weisci_jaws_staging2.users;
+----------+
| count(1) |
+----------+
|      117 |
+----------+
1 row in set (0.00 sec)

mysql>

With this table, here are two questions:

  • Which columns are nullable ?
  • Which columns are not nullable ?

This query will find out for you:

select is_nullable,GROUP_CONCAT(column_name) column_list
from information_schema.columns
where table_schema = 'weisci_jaws_staging2'
and   table_name = 'users'
group by is_nullable;

Check out the result of that query for the table:

mysql> select is_nullable,GROUP_CONCAT(column_name) column_list
    -> from information_schema.columns
    -> where table_schema = 'weisci_jaws_staging2'
    -> and   table_name = 'users'
    -> group by is_nullable;
+-------------+------------------------------------------------------------------------------------------------------------------------------------+
| is_nullable | column_list                                                                                                                        |
+-------------+------------------------------------------------------------------------------------------------------------------------------------+
| NO          | id,never_expire,change_passwd,enabled,username,passwd                                                                              |
| YES         | recovery_key,last_access,bad_passwd_count,updatetime,createtime,last_login,editor,user_type,language,timezone,url,email,name,theme |
+-------------+------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.01 sec)

mysql>

OK, we have two lists. What do we learn from this?

  • If you get two lists, no need to inspect the actual table because the table has non-NULL columns by definition.
  • If you get one list, then
    • If you get only is_nullable='NO', no need to inspect the actual table because the table has non-NULL columns by definition.
    • If you get only is_nullable='YES', the actual table would be a little brittle. There would be no PRIMARY KEY, you poor, tormented soul !!! NOW, you have to resort to reading every row from the actual table.

If you are looking for just the non-null columns, then this would be your desired query:

select GROUP_CONCAT(column_name) nonnull_columns
from information_schema.columns
where table_schema = 'weisci_jaws_staging2'
and   table_name = 'users'
and   is_nullable = 'NO';

Here is the output of this query:

mysql> select GROUP_CONCAT(column_name) nonnull_columns
    -> from information_schema.columns
    -> where table_schema = 'weisci_jaws_staging2'
    -> and   table_name = 'users'
    -> and   is_nullable = 'NO';
+-------------------------------------------------------+
| nonnull_columns                                       |
+-------------------------------------------------------+
| id,username,passwd,change_passwd,never_expire,enabled |
+-------------------------------------------------------+
1 row in set (0.01 sec)

mysql>

Removing the GROUP_CONCAT, you get this:

mysql> select column_name nonnull_column
    -> from information_schema.columns
    -> where table_schema = 'weisci_jaws_staging2'
    -> and   table_name = 'users'
    -> and   is_nullable = 'NO';
+----------------+
| nonnull_column |
+----------------+
| id             |
| username       |
| passwd         |
| change_passwd  |
| never_expire   |
| enabled        |
+----------------+
6 rows in set (0.01 sec)

mysql>

Give it a Try !!!

NOTE : Please notice that I do not need to read the actual table's data content. That's far more efficient than reading the entire table.

UPDATE 2012-11-15 13:40 EDT

The code from @sensware's answer gives NULL columns. The original question asked for non-NULL columns. I augmented the code to test just my table:

SET group_concat_max_len = 4294967295; -- to overcome default 1KB limitation
SELECT CONCAT(
         'SELECT * FROM ('
       ,  GROUP_CONCAT(
            CONCAT(
              'SELECT ', QUOTE(TABLE_NAME), ' AS `table`,'
            , 'IF('
            ,   'COUNT(`', REPLACE(COLUMN_NAME, '`', '``'), '`),'
            ,   'NULL,'
            ,    QUOTE(COLUMN_NAME)
            , ') AS `column` '
            , 'FROM `',
            REPLACE(TABLE_SCHEMA, '`', '``'), '`.`',
            REPLACE(TABLE_NAME, '`', '``'), '`'
            )
            SEPARATOR ' UNION ALL '
         )
       , ') t WHERE `column` IS NOT NULL'
       )
INTO   @sql
FROM   INFORMATION_SCHEMA.COLUMNS
WHERE  TABLE_SCHEMA = 'weisci_jaws_staging2'
AND    TABLE_NAME = 'users';
SELECT @sql\G
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

Here is the output:

mysql> SET group_concat_max_len = 4294967295; -- to overcome default 1KB limitation
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT CONCAT(
    ->          'SELECT * FROM ('
    ->        ,  GROUP_CONCAT(
    ->             CONCAT(
    ->               'SELECT ', QUOTE(TABLE_NAME), ' AS `table`,'
    ->             , 'IF('
    ->             ,   'COUNT(`', REPLACE(COLUMN_NAME, '`', '``'), '`),'
    ->             ,   'NULL,'
    ->             ,    QUOTE(COLUMN_NAME)
    ->             , ') AS `column` '
    ->             , 'FROM `',
    ->             REPLACE(TABLE_SCHEMA, '`', '``'), '`.`',
    ->             REPLACE(TABLE_NAME, '`', '``'), '`'
    ->             )
    ->             SEPARATOR ' UNION ALL '
    ->          )
    ->        , ') t WHERE `column` IS NOT NULL'
    ->        )
    -> INTO   @sql
    -> FROM   INFORMATION_SCHEMA.COLUMNS
    -> WHERE  TABLE_SCHEMA = 'weisci_jaws_staging2'
    -> AND    TABLE_NAME = 'users';
Query OK, 1 row affected (0.02 sec)

mysql> SELECT @sql\G
*************************** 1. row ***************************
@sql: SELECT * FROM (SELECT 'users' AS `table`,IF(COUNT(`id`),NULL,'id') AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`username`),NULL,'username') AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`passwd`),NULL,'passwd') AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`user_type`),NULL,'user_type') AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`recovery_key`),NULL,'recovery_key') AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`name`),NULL,'name') AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`email`),NULL,'email') AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`url`),NULL,'url') AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`timezone`),NULL,'timezone') AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`language`),NULL,'language') AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`theme`),NULL,'theme') AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`editor`),NULL,'editor') AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`last_login`),NULL,'last_login') AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`createtime`),NULL,'createtime') AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`updatetime`),NULL,'updatetime') AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`change_passwd`),NULL,'change_passwd') AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`never_expire`),NULL,'never_expire') AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`bad_passwd_count`),NULL,'bad_passwd_count') AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`last_access`),NULL,'last_access') AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`enabled`),NULL,'enabled') AS `column` FROM `weisci_jaws_staging2`.`users`) t WHERE `column` IS NOT NULL
1 row in set (0.00 sec)

mysql> PREPARE stmt FROM @sql;
Query OK, 0 rows affected (0.01 sec)
Statement prepared

mysql> EXECUTE stmt;
+-------+--------+
| table | column |
+-------+--------+
| users | theme  |
+-------+--------+
1 row in set (0.00 sec)

mysql> DEALLOCATE PREPARE stmt;
Query OK, 0 rows affected (0.00 sec)

mysql>

This give NULL columns. The original question asked for non-NULL columns. I'll change the code to generated non-NULL. I'll do that by flipping the order of the IF..COUNT:

SET group_concat_max_len = 4294967295; -- to overcome default 1KB limitation
SELECT CONCAT(
         'SELECT * FROM ('
       ,  GROUP_CONCAT(
            CONCAT(
              'SELECT ', QUOTE(TABLE_NAME), ' AS `table`,'
            , 'IF('
            ,   'COUNT(`', REPLACE(COLUMN_NAME, '`', '``'), '`),'
            ,    QUOTE(COLUMN_NAME)
            ,   ',NULL'
            , ') AS `column` '
            , 'FROM `',
            REPLACE(TABLE_SCHEMA, '`', '``'), '`.`',
            REPLACE(TABLE_NAME, '`', '``'), '`'
            )
            SEPARATOR ' UNION ALL '
         )
       , ') t WHERE `column` IS NOT NULL'
       )
INTO   @sql
FROM   INFORMATION_SCHEMA.COLUMNS
WHERE  TABLE_SCHEMA = 'weisci_jaws_staging2'
AND    TABLE_NAME = 'users';
SELECT @sql\G
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

Let's run it now...

mysql> SET group_concat_max_len = 4294967295; -- to overcome default 1KB limitation
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT CONCAT(
    ->          'SELECT * FROM ('
    ->        ,  GROUP_CONCAT(
    ->             CONCAT(
    ->               'SELECT ', QUOTE(TABLE_NAME), ' AS `table`,'
    ->             , 'IF('
    ->             ,   'COUNT(`', REPLACE(COLUMN_NAME, '`', '``'), '`),'
    ->             ,    QUOTE(COLUMN_NAME)
    ->             ,   ',NULL'
    ->             , ') AS `column` '
    ->             , 'FROM `',
    ->             REPLACE(TABLE_SCHEMA, '`', '``'), '`.`',
    ->             REPLACE(TABLE_NAME, '`', '``'), '`'
    ->             )
    ->             SEPARATOR ' UNION ALL '
    ->          )
    ->        , ') t WHERE `column` IS NOT NULL'
    ->        )
    -> INTO   @sql
    -> FROM   INFORMATION_SCHEMA.COLUMNS
    -> WHERE  TABLE_SCHEMA = 'weisci_jaws_staging2'
    -> AND    TABLE_NAME = 'users';
Query OK, 1 row affected (0.01 sec)

mysql> SELECT @sql\G
*************************** 1. row ***************************
@sql: SELECT * FROM (SELECT 'users' AS `table`,IF(COUNT(`id`),'id',NULL) AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`username`),'username',NULL) AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`passwd`),'passwd',NULL) AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`user_type`),'user_type',NULL) AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`recovery_key`),'recovery_key',NULL) AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`name`),'name',NULL) AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`email`),'email',NULL) AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`url`),'url',NULL) AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`timezone`),'timezone',NULL) AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`language`),'language',NULL) AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`theme`),'theme',NULL) AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`editor`),'editor',NULL) AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`last_login`),'last_login',NULL) AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`createtime`),'createtime',NULL) AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`updatetime`),'updatetime',NULL) AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`change_passwd`),'change_passwd',NULL) AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`never_expire`),'never_expire',NULL) AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`bad_passwd_count`),'bad_passwd_count',NULL) AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`last_access`),'last_access',NULL) AS `column` FROM `weisci_jaws_staging2`.`users` UNION ALL SELECT 'users' AS `table`,IF(COUNT(`enabled`),'enabled',NULL) AS `column` FROM `weisci_jaws_staging2`.`users`) t WHERE `column` IS NOT NULL
1 row in set (0.00 sec)

mysql> PREPARE stmt FROM @sql;
Query OK, 0 rows affected (0.00 sec)
Statement prepared

mysql> EXECUTE stmt;
+-------+------------------+
| table | column           |
+-------+------------------+
| users | id               |
| users | username         |
| users | passwd           |
| users | user_type        |
| users | recovery_key     |
| users | name             |
| users | email            |
| users | url              |
| users | timezone         |
| users | language         |
| users | editor           |
| users | last_login       |
| users | createtime       |
| users | updatetime       |
| users | change_passwd    |
| users | never_expire     |
| users | bad_passwd_count |
| users | last_access      |
| users | enabled          |
+-------+------------------+
19 rows in set (0.01 sec)

mysql> DEALLOCATE PREPARE stmt;
Query OK, 0 rows affected (0.00 sec)

mysql>

OK it works now. There is a problem still present. The query requires reading the entire table. My test table only has 117 rows and 20 columns. What about bigger tables with millions of rows or dozens of columns? I am not going to speculate because I know that the code would be orders of magnitude worse.

That's why I recommend my answer

select GROUP_CONCAT(column_name) nonnull_columns
from information_schema.columns
where table_schema = 'weisci_jaws_staging2'
and   table_name = 'users'
and   is_nullable = 'NO';

or

select column_name nonnull_column
from information_schema.columns
where table_schema = 'weisci_jaws_staging2'
and   table_name = 'users'
and   is_nullable = 'NO';

because the actual data content does not have to be inspected.

The augmented code I made should only be used on a table where all column allow NULL values, which is extremely rare.