Sql-server – Inaccurate ‘Actual’ row counts in parallel plan

sql server

This is a purely academic question, in so much that it isn't causing a problem and I'm just interested to hear any explanations for the behaviour.

Take a standard issue Itzik Ben-Gan cross-join CTE tally table:

USE [master]
GO

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

CREATE FUNCTION [dbo].[TallyTable] 
(   
    @N INT
)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN 
(
    WITH 
    E1(N) AS 
    (
        SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
        SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
        SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
    )                                       -- 1*10^1 or 10 rows
    , E2(N) AS (SELECT 1 FROM E1 a, E1 b)   -- 1*10^2 or 100 rows
    , E4(N) AS (SELECT 1 FROM E2 a, E2 b)   -- 1*10^4 or 10,000 rows
    , E8(N) AS (SELECT 1 FROM E4 a, E4 b)   -- 1*10^8 or 100,000,000 rows

    SELECT TOP (@N) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM E8 
)
GO

Issue a query that will create a 1 million row number table:

SELECT
    COUNT(N)
FROM
    dbo.TallyTable(1000000) tt

Take a look at the parallel execution plan for this query:

Parallel execution plan

Note the 'actual' row count prior to the gather streams operator is 1,004,588. After the gather streams operator the row count is the expected 1,000,000. Stranger still, the value is not consistent and will vary from run to run. The result of the COUNT is always correct.

Issue the query again, forcing non-parallel plan:

SELECT
    COUNT(N)
FROM
    dbo.TallyTable(1000000) tt
OPTION (MAXDOP 1)

This time all operators show the correct 'actual' row counts.

Non-parallel execution plan

I've tried this on 2005SP3 and 2008R2 so far, same results on both. Any thoughts as to what might cause this?

Best Answer

Rows are passed across exchanges internally from producer to consumer thread in packets (hence CXPACKET - class exchange packet), rather than a row-at-a-time. There is a certain amount of buffering inside the exchange. Also, the call to shut down the pipeline from the consumer side of the Gather Streams has to be passed in a control packet back to the producer threads. Scheduling and other internal considerations mean that parallel plans always have a certain 'stopping distance'.

As a consequence, you will often see this sort of row count difference where less than the whole potential rowset of a sub-tree is actually required. In this case, the TOP brings execution to an 'early end'.

More information: