SQL Server Performance – Poor Cardinality Estimate Disqualifies INSERT from Minimal Logging?

insert, performance, sql-server, sql-server-2014, transaction-log

Why is the second INSERT statement ~5x slower than the first?

From the amount of log data generated, I think that the second is not qualifying for minimal logging. However, the documentation in the Data Loading Performance Guide indicates that both inserts should be able to be minimally logged. So if minimal logging is the key performance difference, why is it that the second query does not qualify for minimal logging? What can be done to improve the situation?

Query #1: Inserting 5MM rows using INSERT…WITH (TABLOCK)

Consider the following query, which inserts 5MM rows into a heap. This query executes in 1 second and generates 64MB of transaction log data as reported by sys.dm_tran_database_transactions.

CREATE TABLE dbo.minimalLoggingTest (n INT NOT NULL)
GO
INSERT INTO dbo.minimalLoggingTest WITH (TABLOCK) (n)
SELECT n
-- Any table/view/sub-query that correctly estimates that it will generate 5MM rows
FROM dbo.fiveMillionNumbers
-- Provides greater consistency on my laptop, where other processes are running
OPTION (MAXDOP 1)
GO
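
As a side note, the log volume figures above can be reproduced by querying sys.dm_tran_database_transactions from inside the inserting transaction. The sketch below is just one way to do that; it assumes the INSERT is wrapped in an explicit transaction purely so the DMV can be read before the commit, and the reported byte count is approximate.

BEGIN TRANSACTION
INSERT INTO dbo.minimalLoggingTest WITH (TABLOCK) (n)
SELECT n
FROM dbo.fiveMillionNumbers
OPTION (MAXDOP 1)
-- Approximate log bytes generated so far by this session's transaction
SELECT dt.database_transaction_log_bytes_used
FROM sys.dm_tran_database_transactions AS dt
JOIN sys.dm_tran_session_transactions AS st
    ON st.transaction_id = dt.transaction_id
WHERE st.session_id = @@SPID
  AND dt.database_id = DB_ID()
COMMIT TRANSACTION
GO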

Query #2: Inserting the same data, but SQL underestimates the # of rows

Now consider this very similar query, which operates on exactly the same data but happens to draw from a table (or complex SELECT statement with many joins in my actual production case) where the cardinality estimate is too low. This query executes in 5.5 seconds and generates 461MB of transaction log data.

CREATE TABLE dbo.minimalLoggingTest (n INT NOT NULL)
GO
INSERT INTO dbo.minimalLoggingTest WITH (TABLOCK) (n)
SELECT n
-- Any table/view/sub-query that produces 5MM rows but SQL estimates just 1000 rows
FROM dbo.fiveMillionNumbersBadEstimate
-- Provides greater consistency on my laptop, where other processes are running
OPTION (MAXDOP 1)
GO

Full script

See this Pastebin for a full set of scripts to generate the test data and execute either of these scenarios. Note that you must use a database that is in the SIMPLE recovery model.

Business context

We semi-frequently move around millions of rows of data, and it's important that these operations be as efficient as possible, both in terms of execution time and disk I/O load. We had initially been under the impression that creating a heap table and using INSERT...WITH (TABLOCK) was a good way to do this, but have become less confident now that we have observed the situation demonstrated above in an actual production scenario (albeit with more complex queries than the simplified version here).

Best Answer

Why is it that the second query does not qualify for minimal logging?

Minimal logging is available for the second query, but the engine chooses not to use it at runtime.

There is a minimum threshold for INSERT...SELECT below which the bulk load optimizations are not used, and the decision is made from the estimated number of rows (and pages) to be inserted, which is why the low cardinality estimate matters here. There is a cost involved in setting up a bulk rowset operation, and bulk-inserting only a few rows would not result in efficient space utilization.
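
One way to see the difference concretely is to look at what kinds of log records each insert produces. This is a hedged sketch using the undocumented sys.fn_dblog function in a scratch database (ideally after a CHECKPOINT under SIMPLE recovery); with minimal logging the output is dominated by allocation records rather than one LOP_INSERT_ROWS record per row.

-- Summarize log records generated since the last checkpoint/truncation
SELECT [Operation], [Context],
       COUNT(*)                  AS log_records,
       SUM([Log Record Length])  AS log_bytes
FROM sys.fn_dblog(NULL, NULL)
GROUP BY [Operation], [Context]
ORDER BY log_bytes DESC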

What can be done to improve the situation?

Use one of the many other methods (e.g. SELECT INTO) that do not have this threshold; a sketch follows below. Alternatively, you might be able to rewrite the source query in some way to boost the estimated number of rows/pages over the threshold for INSERT...SELECT.
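
For example, a minimal sketch of the SELECT INTO alternative, reusing the table names from the scripts above and assuming the target table does not already exist (SELECT INTO creates it):

-- SELECT INTO is eligible for minimal logging under SIMPLE or BULK_LOGGED
-- recovery regardless of the source query's cardinality estimate
SELECT n
INTO dbo.minimalLoggingTest
FROM dbo.fiveMillionNumbersBadEstimate
OPTION (MAXDOP 1)
GO

The trade-off is that SELECT INTO derives the table definition from the SELECT list, so any constraints, defaults, or indexes have to be added after the load.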

See also Geoff's self-answer for more useful information.


Possibly interesting trivia: SET STATISTICS IO reports logical reads for the target table only when bulk loading optimizations are not used.
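
For anyone who wants to observe that behaviour, a quick hedged check using the same tables as above:

SET STATISTICS IO ON
GO
-- Low-estimate source: expect logical reads reported against minimalLoggingTest,
-- indicating the bulk load path was not used
INSERT INTO dbo.minimalLoggingTest WITH (TABLOCK) (n)
SELECT n
FROM dbo.fiveMillionNumbersBadEstimate
OPTION (MAXDOP 1)
GO
SET STATISTICS IO OFF
GO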