In MS SQL Server 2005, how can I extract statistical information about NULL fraction (or NULL count) for a given column?
Same information is available in Oracle and PostgreSQL, and I heard that MS SQL also has statistics, but I don't know where to find it. Google didn't help much.
I hope I will hear something like
SELECT null_frac FROM pg_stats WHERE tablename='mytab' and attname='mycol';
/* PostgreSQL, stats gathered by ANALYZE */
or
SELECT nullcnt FROM custom_stats_table WHERE tabname='MYTAB' and colname='MYCOL';
/* Oracle, stats gathered by dbms_stats.get_column_stats */
I'm not asking how to compute these values myself – I know how to do it.
I need these stats for several thousands of columns, and computing this would take very long on the database that I'm working on. I need approximations, and they should be somewhere in the server – they are needed anyway for cost-based optimizer.
Best Answer
Nothing so easy. You would need to use
DBCC SHOW_STATISTICS
and look at the histogram, taking into account that the statistics might be filtered or multi column.Probably easier to just let SQL Server do it and generate an estimated execution plan for the statement
select * from foo where bar is null
SQL Server will then use appropriate statistics if they exist or create them if they don't (assuming that the auto create statistics option is enabled).You can then look at the estimated number of rows going into the
SELECT
iterator (This shows up asStatementEstRows
in the XML for purposes of parsing programatically)Example code