Sql-server – In SQL Server 2008, how can I get all the columns that are marked as nullable in their schemas even though no records contain NULL for those columns

nullsql-server-2008

In my legacy database, a lot of the time the schema needlessly allows null entries for particular columns. To help find out which columns in particular, I need to do some querying of SQL Server 2008 that beyond my level of expertise. My aim is to tighten up the schema a bit and save myself from dealing with the null case in my code.

A crude way (option 1) to help me solve my problem is to just get all records (and select *) that contain at least one null entry, then eye scan for columns that contain no null entries. I could of course just use Where columnA IS NULL OR columnB IS NULL OR columnC IS NULL, but this gets tedious for tables with lots of columns. Some tables in this database contain more than twenty columns! Also, there's about 500 tables in total. So, general solutions are best.

Here are three options for answers that would satisfy me:

  • Option 1 : "get me all the records where at least one of the columns is null". I will eyescan for columns with no null entries.
  • Option 2 : An even better answer would be some script that gets a list of column names that contain at least one null entry. Then I will see which nullable columns are not in this list manually.
  • Option 3: The best answer would get me a list of column names that are marked as nullable in their schema even though no records exist with a null entry in those columns.

Thanks!

Best Answer

Martin Smith's answer will serve very well to get you all the columns you need for an entire database in SQL 2008. Very nice!

Here is how I did it in the days before SQL had CTEs and PIVOT. This will be compatible with older versions of SQL where Martin's solution won't work, and still works in 2008 as well, but with poorer performance than his solution.

USE MyDB

SET NOCOUNT ON

CREATE TABLE ##nullable  (
    ID INT IDENTITY(1,1),
    SchName VARCHAR(128),
    TblName VARCHAR(128),
    ColName VARCHAR(128),
    hasNulls BIT,
    PRIMARY KEY(ID)
    )

DECLARE @currTbl VARCHAR(128)
DECLARE @currCol VARCHAR(128)
DECLARE @currSch VARCHAR(128)
DECLARE @limit INT
DECLARE @i INT
DECLARE @sql NVARCHAR(4000)

INSERT INTO ##nullable (
    SchName,
    TblName,
    ColName,
    hasNulls
    )
SELECT
    c.TABLE_SCHEMA,
    c.TABLE_NAME,
    c.COLUMN_NAME,
    0 AS hasNulls
FROM INFORMATION_SCHEMA.COLUMNS c
INNER JOIN INFORMATION_SCHEMA.TABLES t
ON c.TABLE_CATALOG = t.TABLE_CATALOG
AND c.TABLE_NAME = t.TABLE_NAME
AND c.TABLE_SCHEMA = t.TABLE_SCHEMA
WHERE c.IS_NULLABLE = 'YES'
AND t.TABLE_TYPE = 'BASE TABLE'

SET @limit = (SELECT MAX(ID) FROM ##nullable)
SET @i = 1

WHILE @i <= @limit
BEGIN   
    SELECT @currSch = SchName,
        @currTbl = TblName,
        @currCol = ColName
    FROM ##nullable 
    WHERE ID = @i

    SET @sql = 'UPDATE ##nullable 
        SET hasNulls = 1
        WHERE ID = ' + CAST(@i AS VARCHAR(20)) + ' 
        AND EXISTS (SELECT 1 FROM ' + QUOTENAME(@currSch) + '.'+ QUOTENAME(@currTbl) + ' 
        WHERE ' + QUOTENAME(@currCol) + ' IS NULL)'

    EXEC(@sql)

    SET @i = @i + 1     
END

SELECT DISTINCT * FROM ##nullable 
WHERE hasNulls = 0  

DROP TABLE ##nullable