SQL Server – Improve reporting stored procedure execution time – tuning temporary tables

performance, performance-tuning, query-performance, sql-server, sql-server-2008-r2

I've been tasked with improving the performance of a reporting stored procedure (this is my first real-world performance tuning task). The procedure is called by an SSRS front end and currently takes about 30 seconds to run against the largest amount of data (based on filters set from the report front end).

This stored procedure runs 19 queries, most of which transform the data from an initial (legacy) format in the base tables into a meaningful dataset to be displayed to the business side.

I've created a query based on a few DMVs to find the most resource-consuming queries in the stored procedure (small snippet below), and I've found one query which takes about 10 seconds, on average, to complete.

select
    object_name(st.objectid)                                                                    [Procedure Name]
    , dense_rank() over (partition by st.objectid order by qs.last_elapsed_time desc)           [rank-execution time]
    , dense_rank() over (partition by st.objectid order by qs.last_logical_reads desc)          [rank-logical reads]
    , dense_rank() over (partition by st.objectid order by qs.last_worker_time desc)            [rank-worker (CPU) time]
    , dense_rank() over (partition by st.objectid order by qs.last_logical_writes desc)         [rank-logical write]
        ...
from sys.dm_exec_query_stats as qs
    cross apply sys.dm_exec_sql_text (qs.sql_handle) as st
    cross apply sys.dm_exec_text_query_plan (qs.plan_handle, qs.statement_start_offset, qs.statement_end_offset) as qp
where st.objectid in ( object_id('SuperDooperReportingProcedure') )
order by
    [rank-execution time]
    , [rank-logical reads]
    , [rank-worker (CPU) time]
    , [rank-logical write] desc
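
One caveat with the last_* columns used above: they reflect only the most recent execution, so the ranking can jump around between runs. A variant worth trying (a sketch along the same lines) ranks by the average over all executions instead:

select
    object_name(st.objectid)                                        [Procedure Name]
    , dense_rank() over (partition by st.objectid
        order by qs.total_elapsed_time / qs.execution_count desc)   [rank-avg execution time]
from sys.dm_exec_query_stats as qs
    cross apply sys.dm_exec_sql_text (qs.sql_handle) as st
where st.objectid = object_id('SuperDooperReportingProcedure')
order by [rank-avg execution time];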

Now, the query in question is a bit strange in the sense that the execution plan shows that the bulk of the work (~80%) is done when inserting the data into the local temporary table, not when interrogating the other tables from which the source data is taken and manipulated. (The screenshot below is from SQL Sentry Plan Explorer.)

[screenshot: execution plan of the query, from SQL Sentry Plan Explorer]

Also, the row estimates in the execution plan are way off: only 4,218 rows are actually inserted into the local temporary table, as opposed to the ~248k rows the plan thinks it's moving into it. Because of this, I'm thinking "statistics", but do those even matter if ~80% of the work is the actual insert into the table?
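
If stale statistics on the temporary tables turn out to matter after all, two cheap experiments (a sketch, using the temp table names from the query shown further down) are to refresh their statistics right before the expensive statement, or to add OPTION (RECOMPILE) to that one statement so it is planned with the row counts that actually exist at run time:

-- refresh temp table statistics just before the expensive SELECT ... INTO
update statistics #StoresBudgets;
update statistics #temp_report_data;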

One of my first recommendations was to rewrite the entire process and the stored procedure so that the moving and transforming of the data is not done inside the reporting stored procedure, but nightly, into persisted tables (real-time data is not required, only data through the end of the previous day). But the business side does not want to invest time and resources into redesigning this, and instead "suggests" I do performance tuning in the sense of finding where and what indexes I can add to speed this up.

I don't believe that adding indexes to the base tables will improve the performance of the report, since most of the time needed for running the query is spent saving the data into a temporary table. As I understand it, that insert hits tempdb, meaning the rows are written to disk, and the I/O latency adds time.
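
To verify how much of the elapsed time really is the insert, the statement can be run in isolation with I/O and timing statistics enabled (a measurement only, not a fix):

set statistics io on;
set statistics time on;

-- run just the SELECT ... INTO statement here; the Messages tab then
-- shows logical reads per table plus CPU and elapsed time, separating
-- the cost of reading the sources from the cost of the insert

set statistics time off;
set statistics io off;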

Even so, as I've mentioned, this is my first performance tuning task. I've tried to read as much as possible about this over the last couple of days, and these are my conclusions so far, but I'd like to ask a broader audience for advice and hopefully get a few more insights into what I can do to improve this procedure.

A few specific questions I'd appreciate answers to:

  • Is there anything incorrect in what I've said above (in my understanding of the DB or my assumptions)?
  • Is it true that adding an index to a temporary table will actually increase the execution time, since the table (and its associated indexes) is rebuilt on each execution? (See the sketch after this list.)
  • Could anything else be done in this scenario without rewriting the procedure/queries, only via indexes or other tuning methods? (I've read a few article headlines saying you can also "tune tempdb", but I haven't gotten into the details of those yet.)
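
On the second question, a way to test it directly: building an index on a temp table does add cost on every execution, but if the join it supports saves more than the build takes, it can still be a net win. A minimal sketch (the key columns are an assumption, mirrored from the join in the query below):

-- created once per execution, right after the temp tables are populated;
-- key columns mirror the join predicates in the expensive query
create clustered index cx_StoresBudgets
    on #StoresBudgets (StoreID, ProgramID);

create clustered index cx_temp_report_data
    on #temp_report_data (store_ID, newSourceID);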

Any help is very much appreciated, and if you need more details I'll be happy to post them.

Additional details:

The query in question is (partially) shown below. What's missing are a few more aggregate columns and their corresponding lines in the GROUP BY clause:

select
    b.ProgramName
    ,b.Region
    ,case when b.AM IS null and b.ProgramName IS not null 
        then 'Unassigned' 
        else b.AM 
    end as AM
    ,rtrim(ltrim(b.Store)) Store
    ,trd.Store_ID
    ,b.appliesToPeriod
    ,isnull(trd.countLeadActual,0) as Actual
    ,isnull(sum(case when b.budgetType = 0 and b.budgetMonth between @start_date and @end_date then b.budgetValue else 0 end),0) as Budget
    ,isnull(sum(case when b.budgetType = 0 and b.budgetMonth between @start_date and @end_date and (trd.considerMe = -1 or b.StoreID < 0) then b.budgetValue else 0 end),0) as CleanBudget
    ... 
into #SalvesVsBudgets
from #StoresBudgets b
    left join #temp_report_data trd on trd.store_ID = b.StoreID and trd.newSourceID = b.ProgramID
where (b.StoreDivision is not null or (b.StoreDivision is null and b.ProgramName = 'NewProgram'))
    group by
        b.ProgramName
        ,b.Region
        ,case when b.AM IS null and b.ProgramName IS not null 
            then 'Unassigned' 
            else b.AM 
        end
        ,rtrim(ltrim(b.Store))
        ,trd.Store_ID
        ,b.appliesToPeriod
        ,isnull(trd.countLeadActual,0)

I'm not sure if this is actually helpful, but I'm adding the information just in case:

  • the temporary tables have no indexes on them
  • RAM size: 32 GB

I have tried moving the CASE statements out of the aggregate-generating query, but unfortunately the overall procedure time has not noticeably improved; it still fluctuates within roughly 0.25 to 1.0 second of the original, both below and above the original version's time, which I'm guessing is due to variable workload on my machine.

The execution plan for the same query, modified to remove the CASE conditions and leave only the SUM aggregates, is now:

[screenshot: execution plan of the modified query]

Best Answer

You might be able to do a union instead of the or. That could prevent a table scan. (Note that with SELECT ... INTO, the INTO clause goes only in the first branch of the union; the combined result is inserted into the new table.)

select
    b.ProgramName
    ,b.Region
    ,case when b.AM IS null and b.ProgramName IS not null 
        then 'Unassigned' 
        else b.AM 
    end as AM
    ,rtrim(ltrim(b.Store)) Store
    ,trd.Store_ID
    ,b.appliesToPeriod
    ,isnull(trd.countLeadActual,0) as Actual
    ,isnull(sum(case when b.budgetType = 0 and b.budgetMonth between @start_date and @end_date then b.budgetValue else 0 end),0) as Budget
    ,isnull(sum(case when b.budgetType = 0 and b.budgetMonth between @start_date and @end_date and (trd.considerMe = -1 or b.StoreID < 0) then b.budgetValue else 0 end),0) as CleanBudget
    ... 
into #SalvesVsBudgets
from #StoresBudgets b
    left join #temp_report_data trd on trd.store_ID = b.StoreID and trd.newSourceID = b.ProgramID
where (b.StoreDivision is not null)
    group by
        b.ProgramName
        ,b.Region
        ,case when b.AM IS null and b.ProgramName IS not null 
            then 'Unassigned' 
            else b.AM 
        end
        ,rtrim(ltrim(b.Store))
        ,trd.Store_ID
        ,b.appliesToPeriod
        ,isnull(trd.countLeadActual,0)

union

select
    b.ProgramName
    ,b.Region
    ,case when b.AM IS null and b.ProgramName IS not null 
        then 'Unassigned' 
        else b.AM 
    end as AM
    ,rtrim(ltrim(b.Store)) Store
    ,trd.Store_ID
    ,b.appliesToPeriod
    ,isnull(trd.countLeadActual,0) as Actual
    ,isnull(sum(case when b.budgetType = 0 and b.budgetMonth between @start_date and @end_date then b.budgetValue else 0 end),0) as Budget
    ,isnull(sum(case when b.budgetType = 0 and b.budgetMonth between @start_date and @end_date and (trd.considerMe = -1 or b.StoreID < 0) then b.budgetValue else 0 end),0) as CleanBudget
    ... 
from #StoresBudgets b
    left join #temp_report_data trd on trd.store_ID = b.StoreID and trd.newSourceID = b.ProgramID
where (b.StoreDivision is null and b.ProgramName = 'NewProgram')
    group by
        b.ProgramName
        ,b.Region
        ,case when b.AM IS null and b.ProgramName IS not null 
            then 'Unassigned' 
            else b.AM 
        end
        ,rtrim(ltrim(b.Store))
        ,trd.Store_ID
        ,b.appliesToPeriod
        ,isnull(trd.countLeadActual,0);
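
One thing worth testing with this rewrite: since the two WHERE branches are mutually exclusive (StoreDivision is not null versus is null), UNION ALL should return the same rows as UNION while skipping the duplicate-eliminating sort that plain UNION adds.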

Additionally, you might be getting forced into a single-threaded execution plan by the trim functions. If you are able to do the LTRIM and RTRIM in the application that consumes the data, instead of in the query, you might get an execution plan that goes parallel.
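
If pushing the trims into the consuming application is not feasible, another option (a sketch; the column name and length are assumptions) is to trim once while the temp table is being prepared, so that the grouping key becomes a plain column:

-- hypothetical pre-trimmed column; the aggregate query would then group
-- by StoreTrimmed instead of rtrim(ltrim(b.Store))
alter table #StoresBudgets add StoreTrimmed varchar(255);

update #StoresBudgets
set StoreTrimmed = rtrim(ltrim(Store));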