Sql-server – Get NULL statistics for a column in MS SQL Server

sql serversql-server-2005statistics

In MS SQL Server 2005, how can I extract statistical information about NULL fraction (or NULL count) for a given column?

Same information is available in Oracle and PostgreSQL, and I heard that MS SQL also has statistics, but I don't know where to find it. Google didn't help much.

I hope I will hear something like

SELECT null_frac FROM pg_stats WHERE tablename='mytab' and attname='mycol';
/* PostgreSQL, stats gathered by ANALYZE */

or

SELECT nullcnt FROM custom_stats_table WHERE tabname='MYTAB' and colname='MYCOL';
/* Oracle, stats gathered by dbms_stats.get_column_stats */

I'm not asking how to compute these values myself – I know how to do it.
I need these stats for several thousands of columns, and computing this would take very long on the database that I'm working on. I need approximations, and they should be somewhere in the server – they are needed anyway for cost-based optimizer.

Best Answer

Nothing so easy. You would need to use DBCC SHOW_STATISTICS and look at the histogram, taking into account that the statistics might be filtered or multi column.

Probably easier to just let SQL Server do it and generate an estimated execution plan for the statement select * from foo where bar is null SQL Server will then use appropriate statistics if they exist or create them if they don't (assuming that the auto create statistics option is enabled).

You can then look at the estimated number of rows going into the SELECT iterator (This shows up as StatementEstRows in the XML for purposes of parsing programatically)

Example code

/*Create temporary stored procedure with query of interest*/
CREATE PROC dbo.#plan_test
AS
SELECT *
FROM master..spt_values
WHERE low IS NULL

GO

/*Generate an execution plan*/
SET FMTONLY ON
EXEC dbo.#plan_test
SET FMTONLY OFF;

/*Retrieve the plan and parse the estimated rows*/
WITH  XMLNAMESPACES ('http://schemas.microsoft.com/sqlserver/2004/07/showplan' as sql) 
SELECT  query_plan.value('(//sql:StmtSimple[1]/@StatementEstRows)[1]', 'float') AS EstimatedNumberOfRows
FROM sys.dm_exec_cached_plans 
CROSS APPLY sys.dm_exec_sql_text(plan_handle) 
CROSS APPLY sys.dm_exec_query_plan(plan_handle) 
WHERE sys.dm_exec_sql_text.objectid=OBJECT_ID('tempdb.dbo.#plan_test') 
     AND sys.dm_exec_sql_text.dbid=2

DROP PROC dbo.#plan_test