SQL Server – Efficiently Check for NULL Values in All Columns for Multiple Tables

performancequery-performancesql serversql-server-2008t-sql

Say I have three tables or four tables that I want to check for values present and I don't want any of them to be null. IF they are null, I will send an email notification per row. What would be the best way to go about this? Should I make a stored procedure with IF statements separately?

I say this because all these tables are not having any constraints, they just pull data out of an excel sheet. So making a table type variable and unioning or inner joining returns nothing as even the column names are different, but the information they store is the same kind of information. So for example if in one table, a column exists called 'DueDate', in the other table, this same sort of information is called 'PaymentDate'. How would I perform such a check on all these tables?

Update: Here's what I've thought of as of now: Say the two tables are A and B, then

DECLARE @messageBody NVARCHAR(MAX)
DECLARE @iterator int
DECLARE @Rowcount int
DECLARE @varTableA TABLE(tablecounter int IDENTITY(1,1), DueDate date, Account NVARCHAR(20), Retailer NVARCHAR(20), Interest float)


INSERT INTO @varTableA (DueDate, Account, Retailer, Interest)
SELECT DueDate, Account, Retailer, Interest
FROM TableA 
WHERE DueDate IS NULL OR Account IS NULL OR Retailer IS NULL OR Interest IS NULL

SET @Rowcount = @@ROWCOUNT
SET @iterator = 1


WHILE(@iterator <= @Rowcount)
BEGIN 

SET @messageBody = (SELECT 'No values in following columns' + 'Duedate: ' + ISNULL(Duedate, 'No value') + CHAR(10) + CHAR(13) 
+ 'Account: ' + ISNULL(Account, 'No value') + CHAR(10) + CHAR(13)
+ 'Retailer: ' + ISNULL(Retailer, 'No Value') + CHAR(10) + CHAR(13)
+ 'Interest: ' + ISNULL(Interest, 'No Value') + CHAR(10) + CHAR(13)

                    FROM @varTableA 
                    WHERE @tablecounter = @iterator)

EXEC msdb.dbo.sp_send_dbmail @profilename = [name], @recipients = [reclist], @subject = [sub], @body = @messageBody
SET @iterator = @iterator + 1
END

And to repeat this in either the same or a separate stored Procedure for TableB and Table C. Is there a better way of doing this or do you think I am going about this the right way?

Best Answer

For part 1 of your question, I created these three tables in an empty database:

CREATE TABLE dbo.table1(a INT, b INT);
CREATE TABLE dbo.table2(a INT, b INT);
CREATE TABLE dbo.table3(a INT, b INT);

INSERT dbo.table1(a,b) VALUES(1,1),(1,2),(1,NULL);
INSERT dbo.table2(a,b) VALUES(1,1),(NULL,NULL),(1,NULL);
INSERT dbo.table3(a,b) VALUES(1,1),(1,2),(1,2);

Then I wrote this code to extract various bits of NULL information:

CREATE TABLE #x
(
  t           NVARCHAR(512),                     -- table name
  nullrows    INT NOT NULL DEFAULT 0,            -- number of rows with at least one NULL
  nullvalues  INT NOT NULL DEFAULT 0,            -- total NULL values in table
  nulldist    NVARCHAR(MAX) NOT NULL DEFAULT N'' -- how many NULLs in each column
);

INSERT #x(t) VALUES(N'dbo.table1'),(N'dbo.table2'),(N'dbo.table3');

Now some dynamic SQL fun derived from metadata:

DECLARE @sql NVARCHAR(MAX) = N'';


SELECT @sql = @sql + N'
  UPDATE #x SET nullrows = nullrows 
    + (SELECT COUNT(*) FROM ' + t.t 
    + N' WHERE (' + STUFF((SELECT N' OR ' 
    + QUOTENAME(c.name) + N' IS NULL'
  FROM sys.columns AS c
  WHERE OBJECT_ID(t.t) = c.[object_id]
  FOR XML PATH, TYPE).value(N'.[1]',N'nvarchar(max)'),
    1, 4, N'') + N')) WHERE t = N''' + t.t + N''';'
FROM #x AS t;

EXEC sys.sp_executesql @sql;
SET @sql = N'';

SELECT @sql = @sql + N'
  IF EXISTS (SELECT 1 FROM ' + t + N' WHERE ' 
    + QUOTENAME(c.name) + N' IS NULL)
  UPDATE #x SET nullvalues = nullvalues 
    + (SELECT COUNT(*) FROM ' + t + N' WHERE ' 
    + QUOTENAME(c.name) + N' IS NULL),
    nulldistribution = nulldistribution + '''
    + c.name + N':'' + RTRIM((SELECT COUNT(*) FROM ' 
    + t + N' WHERE ' + QUOTENAME(c.name) + N' IS NULL))
    + N'','' WHERE t = N''' + t + N''';'
  FROM #x AS t
  INNER JOIN sys.columns AS c
  ON OBJECT_ID(t.t) = c.[object_id];

EXEC sys.sp_executesql @sql;

SELECT * FROM #x;

Results:

t             nullrows   nullvalues   nulldistribution
dbo.table1    1          1            b:1,
dbo.table2    2          3            a:1,b:2,
dbo.table3    0          0  

I'll leave it as an exercise to the reader how to send an e-mail based on those results - that does not exactly seem like the hard part of this.

For part 2 of your question, as I said in a comment:

There is no code you could possibly write that would tell you that DueDate and PaymentDate contain the same kind of information, other than checking that they share the same data type. This is what source control and documentation are for.

I would argue that this is a different and unanswerable question.