Sql-server – Splitting SQL query with many joins into smaller ones helps

join;optimizationsql server

We need to do some reporting every night on our SQL Server 2008 R2. Calculating the reports takes several hours. In order to shorten the time we precalculate a table. This table is created based on JOINining 12 quite big (tens of milions row) tables.

The calculation of this aggregation table took until few days ago cca 4 hours. Our DBA than split this big join into 3 smaller joins (each joining 4 tables). The temporary result is saved into a temporary table every time, which is used in the next join.

The result of the DBA enhancement is, that the aggregation table is calculated in 15 minutes. I wondered how is that possible. DBA told me that it is because the number of data the server must process is smaller. In other words, that in the big original join the server has to work with more data than in summed smaller joins. However, I would presume that optimizer would take care of doing it efficiently with the original big join, splitting the joins on its own and sending only the number of columns needed to next joins.

The other thing he has done is that he created an index on one of the temporary tables. However, once again I would think that the optimizer will create the appropriate hash tables if needed and altogether better optimize the computation.

I talked about this with our DBA, but he was himself uncertain about what cased the improvement in processing time. He just mentioned, that he would not blame the server as it can be overwhelming to compute such big data and that it is possible that the optimizer has hard time to predict the best execution plan … . This I understand, but I would like to have more defining answer as to exactly why.

So, the questions are:

What could possibly cause the big improvement?
Is it a standard procedure to split big joins into smaller?
Is the amount of data which server has to process really smaller in case of multiple smaller joins?

Here is the original query:

    Insert Into FinalResult_Base
SELECT       
    TC.TestCampaignContainerId,
    TC.CategoryId As TestCampaignCategoryId,
    TC.Grade,
    TC.TestCampaignId,    
    T.TestSetId
    ,TL.TestId
    ,TSK.CategoryId
    ,TT.[TestletId]
    ,TL.SectionNo
    ,TL.Difficulty
    ,TestletName = Char(65+TL.SectionNo) + CONVERT(varchar(4),6 - TL.Difficulty) 
    ,TQ.[QuestionId]
    ,TS.StudentId
    ,TS.ClassId
    ,RA.SubjectId
    ,TQ.[QuestionPoints] 
    ,GoodAnswer  = Case When TQ.[QuestionPoints] Is null Then 0
                      When TQ.[QuestionPoints] > 0 Then 1 
                      Else 0 End
    ,WrongAnswer = Case When TQ.[QuestionPoints] = 0 Then 1 
                      When TQ.[QuestionPoints] Is null Then 1
                     Else 0 End
    ,NoAnswer    = Case When TQ.[QuestionPoints] Is null Then 1 Else 0 End
    ,TS.Redizo
    ,TT.ViewCount
    ,TT.SpentTime
    ,TQ.[Position]  
    ,RA.SpecialNeeds        
    ,[Version] = 1 
    ,TestAdaptationId = TA.Id
    ,TaskId = TSK.TaskId
    ,TaskPosition = TT.Position
    ,QuestionRate = Q.Rate
    ,TestQuestionId = TQ.Guid
    ,AnswerType = TT.TestletAnswerTypeId
FROM 
    [TestQuestion] TQ WITH (NOLOCK)
    Join [TestTask] TT WITH (NOLOCK)            On TT.Guid = TQ.TestTaskId
    Join [Question] Q WITH (NOLOCK)         On TQ.QuestionId =  Q.QuestionId
    Join [Testlet] TL WITH (NOLOCK)         On TT.TestletId  = TL.Guid 
    Join [Test]     T WITH (NOLOCK)         On TL.TestId     =  T.Guid
    Join [TestSet] TS WITH (NOLOCK)         On T.TestSetId   = TS.Guid 
    Join [RoleAssignment] RA WITH (NOLOCK)  On TS.StudentId  = RA.PersonId And RA.RoleId = 1
    Join [Task] TSK WITH (NOLOCK)       On TSK.TaskId = TT.TaskId
    Join [Category] C WITH (NOLOCK)     On C.CategoryId = TSK.CategoryId
    Join [TimeWindow] TW WITH (NOLOCK)      On TW.Id = TS.TimeWindowId 
    Join [TestAdaptation] TA WITH (NOLOCK)  On TA.Id = TW.TestAdaptationId
    Join [TestCampaign] TC WITH (NOLOCK)        On TC.TestCampaignId = TA.TestCampaignId 
WHERE
    T.TestTypeId = 1    -- eliminuji ankety 
    And t.ProcessedOn is not null -- ne vsechny, jen dokoncene
    And TL.ShownOn is not null
    And TS.Redizo not in (999999999, 111111119)
END;

The new splitted joins after DBA great work:

    SELECT       
    TC.TestCampaignContainerId,
    TC.CategoryId As TestCampaignCategoryId,
    TC.Grade,
    TC.TestCampaignId,    
    T.TestSetId
    ,TL.TestId
    ,TL.SectionNo
    ,TL.Difficulty
    ,TestletName = Char(65+TL.SectionNo) + CONVERT(varchar(4),6 - TL.Difficulty) -- prevod na A5, B4, B5 ...
    ,TS.StudentId
    ,TS.ClassId
    ,TS.Redizo
    ,[Version] = 1 -- ? 
    ,TestAdaptationId = TA.Id
    ,TL.Guid AS TLGuid
    ,TS.TimeWindowId
INTO
    [#FinalResult_Base_1]
FROM 
    [TestSet] [TS] WITH (NOLOCK)
    JOIN [Test] [T] WITH (NOLOCK) 
        ON [T].[TestSetId] = [TS].[Guid] AND [TS].[Redizo] NOT IN (999999999, 111111119) AND [T].[TestTypeId] = 1 AND [T].[ProcessedOn] IS NOT NULL
    JOIN [Testlet] [TL] WITH (NOLOCK)
        ON [TL].[TestId] = [T].[Guid] AND [TL].[ShownOn] IS NOT NULL
    JOIN [TimeWindow] [TW] WITH (NOLOCK)
        ON [TW].[Id] = [TS].[TimeWindowId] AND [TW].[IsActive] = 1
    JOIN [TestAdaptation] [TA] WITH (NOLOCK)
        ON [TA].[Id] = [TW].[TestAdaptationId] AND [TA].[IsActive] = 1
    JOIN [TestCampaign] [TC] WITH (NOLOCK)
        ON [TC].[TestCampaignId] = [TA].[TestCampaignId] AND [TC].[IsActive] = 1
    JOIN [TestCampaignContainer] [TCC] WITH (NOLOCK)
        ON [TCC].[TestCampaignContainerId] = [TC].[TestCampaignContainerId] AND [TCC].[IsActive] = 1
    ;

 SELECT       
    FR1.TestCampaignContainerId,
    FR1.TestCampaignCategoryId,
    FR1.Grade,
    FR1.TestCampaignId,    
    FR1.TestSetId
    ,FR1.TestId
    ,TSK.CategoryId AS [TaskCategoryId]
    ,TT.[TestletId]
    ,FR1.SectionNo
    ,FR1.Difficulty
    ,TestletName = Char(65+FR1.SectionNo) + CONVERT(varchar(4),6 - FR1.Difficulty) -- prevod na A5, B4, B5 ...
    ,FR1.StudentId
    ,FR1.ClassId
    ,FR1.Redizo
    ,TT.ViewCount
    ,TT.SpentTime
    ,[Version] = 1 -- ? 
    ,FR1.TestAdaptationId
    ,TaskId = TSK.TaskId
    ,TaskPosition = TT.Position
    ,AnswerType = TT.TestletAnswerTypeId
    ,TT.Guid AS TTGuid

INTO
    [#FinalResult_Base_2]
FROM 
    #FinalResult_Base_1 FR1
    JOIN [TestTask] [TT] WITH (NOLOCK)
        ON [TT].[TestletId] = [FR1].[TLGuid] 
    JOIN [Task] [TSK] WITH (NOLOCK)
        ON [TSK].[TaskId] = [TT].[TaskId] AND [TSK].[IsActive] = 1
    JOIN [Category] [C] WITH (NOLOCK)
        ON [C].[CategoryId] = [TSK].[CategoryId]AND [C].[IsActive] = 1
    ;    

DROP TABLE [#FinalResult_Base_1]

CREATE NONCLUSTERED INDEX [#IX_FR_Student_Class]
ON [dbo].[#FinalResult_Base_2] ([StudentId],[ClassId])
INCLUDE ([TTGuid])

SELECT       
    FR2.TestCampaignContainerId,
    FR2.TestCampaignCategoryId,
    FR2.Grade,
    FR2.TestCampaignId,    
    FR2.TestSetId
    ,FR2.TestId
    ,FR2.[TaskCategoryId]
    ,FR2.[TestletId]
    ,FR2.SectionNo
    ,FR2.Difficulty
    ,FR2.TestletName
    ,TQ.[QuestionId]
    ,FR2.StudentId
    ,FR2.ClassId
    ,RA.SubjectId
    ,TQ.[QuestionPoints] -- 1+ good, 0 wrong, null no answer
    ,GoodAnswer  = Case When TQ.[QuestionPoints] Is null Then 0
                      When TQ.[QuestionPoints] > 0 Then 1 -- cookie
                      Else 0 End
    ,WrongAnswer = Case When TQ.[QuestionPoints] = 0 Then 1 
                      When TQ.[QuestionPoints] Is null Then 1
                     Else 0 End
    ,NoAnswer    = Case When TQ.[QuestionPoints] Is null Then 1 Else 0 End
    ,FR2.Redizo
    ,FR2.ViewCount
    ,FR2.SpentTime
    ,TQ.[Position] AS [QuestionPosition]  
    ,RA.SpecialNeeds -- identifikace SVP        
    ,[Version] = 1 -- ? 
    ,FR2.TestAdaptationId
    ,FR2.TaskId
    ,FR2.TaskPosition
    ,QuestionRate = Q.Rate
    ,TestQuestionId = TQ.Guid
    ,FR2.AnswerType
INTO
    [#FinalResult_Base]
FROM 
    [#FinalResult_Base_2] FR2
    JOIN [TestQuestion] [TQ] WITH (NOLOCK)
        ON [TQ].[TestTaskId] = [FR2].[TTGuid]
    JOIN [Question] [Q] WITH (NOLOCK)
        ON [Q].[QuestionId] = [TQ].[QuestionId] AND [Q].[IsActive] = 1

    JOIN [RoleAssignment] [RA] WITH (NOLOCK)
        ON [RA].[PersonId] = [FR2].[StudentId]
        AND [RA].[ClassId] = [FR2].[ClassId] AND [RA].[IsActive] = 1 AND [RA].[RoleId] = 1

    drop table #FinalResult_Base_2;

    truncate table [dbo].[FinalResult_Base];
    insert into [dbo].[FinalResult_Base] select * from #FinalResult_Base;

    drop table #FinalResult_Base;

Best Answer

1 Reduction of 'search space', coupled with better statistics for the intermediate/late joins.

I've had to deal with 90-table joins (mickey mouse design) where the Query Processor refused to even create a plan. Breaking such a join into 10 subjoins of 9 tables each, dramatically brought down the complexity of each join, which grows exponentially with each additional table. Plus the Query Optimiser now treats them as 10 plans, spending (potentially) more time overall (Paul White may even have metrics!).

The intermediate result tables will now have fresh statistics of their own, thus joining much better compared to the statistics of a deep tree that become skewed early on and end up as Science Fiction soon afterwards.

Plus you can force the most selective joins first, cutting down the data volumes moving up the tree. If you can estimate the selectivity of your predicates much better than the Optimiser, why not force the join order. Might be worth searching for "Bushy Plans".

2 It should be considered in my view, if efficiency and performance are important

3 Not necessarily, but it could be if the most selective joins are executed early on

Related Solutions

Sql-server – SHOWPLAN does not display a warning but “Include Execution Plan” does for the same query

This:

SET SHOWPLAN_XML ON;
GO
SELECT * FROM sys.objects;
GO

Is equivalent to pressing Display Estimated Execution Plan on the toolbar (or hitting Ctrl + L). You'll notice that no rows are returned from the query, like there is when you use Include Actual Execution Plan (Ctrl + M).

The spill warning is only a runtime warning. There is no way that SQL Server can know, when displaying the estimated plan, that a spill will happen at runtime. This is because a spill is caused by factors that might only be present during certain invocations of the query (for example, when there is memory pressure). The estimated plan knows roughly how much memory it's going to ask for, but it can't know until execution that it isn't going to get it.

As an aside, may I recommend* our free tool, SQL Sentry Plan Explorer? I think it provides much more obvious information than Management Studio. I recently wrote a lengthy blog post that can act as a tutorial, and Jonathan Kehayias has a great PluralSight course on it as well.

_{* Disclaimer: I work for SQL Sentry.}

Sql-server – Massive joins Vs Updating temp table

SQL Server has an upper bound on creating efficient query plans given a moderately complex query involving a lot of joins - there isn't a single upper bound or magic formula to determine when a query is complex enough to cause a problem; it is very case-by-case and involves baselining from some known expectation (people sometimes think a certain query shouldn't take 30 seconds, but sometimes that's really the best and most optimal outcome for that context). In other cases, SQL Server is simply overwhelmed and can't (at least when not given infinite time) quickly come up with an optimal plan. So it will spend some time coming up with a "best plan I can find in this time" and then it will show "Reason for early termination: timeout" and you get what you get. You can force SQL Server to spend more time coming up with a plan using trace flag 2301 (which Paul White mentions and links to a more thorough piece in this answer), but more time may not prove to help that much (and more compile time may not be a good thing for end user performance on the whole). Another alternative is to break up the logic so SQL Server can focus on optimizing smaller, less complex queries.

Think of it like eating a pie - you can probably eat a pretty good-sized piece in one sitting, but if your goal is to eat the whole pie, you might want to spread it out across multiple settings.

For SQL Server, in some cases it can do better with chunks of that logic separated. This doesn't mean you should simplify all joins into #temp table waterfalls, but in some extreme cases it is a valid workaround.

Best Answer

Related Solutions

Sql-server – SHOWPLAN does not display a warning but “Include Execution Plan” does for the same query

Sql-server – Massive joins Vs Updating temp table

Related Question