I want to take aggregate data for each sales person employed and display the Salesname
, saleamount
, and saleshippingamount
. I have the query set to use 2 UPDATE
CTE statements to update a table that house the master data for each employee. I am updating because I need aggregate totals to be displayed.
The syntax takes roughly 10 minutes to complete and between the two tables there are roughly 60,000 rows to process. Is a CTE The wrong way to go about this?
Below is sample DDL. The employee names are populated from the HR table, but I did not show DDL for this, just an insert statement.
Create Table #FinalData
(
employee varchar(400)
,tr float
,tf float
)
Create Table #TR
(
employee varchar(400)
,saleamount float
,saledate date
)
Create Table #TF
(
employee varchar(400)
,saleshippingamt float
,saledate date
)
INSERT INTO #TR (employee, saleamount, saledate) VALUES
('Employee 1', '12.63', '2017-01-01'), ('Employee 1', '15.00' ,'2017-01-02')
,('Employee 2', '14.00', '2017-01-03'), ('Employee 2' ,'12.00', '2017-01-03')
,('Employee 3', '16.00', '2017-01-03'), ('Employee 3', '13.00', '2017-01-04')
INSERT INTO #TF (employee, saleshippingamt, saledate) VALUES
('Employee 1', '5.00', '2017-01-01'), ('Employee 1', '6.00' ,'2017-01-02')
,('Employee 2', '5.50', '2017-01-03'), ('Employee 2' ,'5.00', '2017-01-03')
,('Employee 3', '6.00', '2017-01-03'), ('Employee 3', '6.00', '2017-01-04')
INSERT INTO #FINALDATA (employee) Values
('Employee 1'), ('Employee 2'), ('Employee 3')
;With TR As
(
SELECT
employee
,SUM(saleamount) Totals
FROM #TR
WHERE saledate BETWEEN '2017-01-01' AND '2017-01-25'
GROUP BY employee
)
UPDATE t
SET tr = t2.TotalCount
FROM #FinalData t
INNER JOIN (Select employee, SUM(Totals) TotalCount
FROM TR
GROUP BY employee) As t2
ON t.employee = t2.employee
;With TF As
(
SELECT
employee
,SUM(saleshippingamt) Totals
FROM #TF
WHERE saledate BETWEEN '2017-01-01' AND '2017-01-25'
GROUP BY employee
)
UPDATE t
SET tf = t2.TotalCount
FROM #FinalData t
INNER JOIN (Select employee, SUM(Totals) TotalCount
FROM TF
GROUP BY employee) As t2
ON t.employee = t2.employee
Select * FROM #FinalData
And this is my execution plan.
What is an option that I have to optimize this query?
Best Answer
I've finally been able to do some testing and playing with the query and can give you more concrete suggestions. I used AdventureWorks for my sample so I had some actual data to work with.
Option 1 - Add Indexes
This will give you the best performance increase for the least amount of effort. I saw an increase of 40% in query execution in my sample set just by adding three indexes.
Tables
#TR
and#TF
. Add an index on thesaledate
,employee
, and include the amount.for the
#FinalData
, you should have a clustered index on theemployee
field.Option 2 - Change your query
Your query could use some tweaking, it may even allow you to get rid of the temp table all together.
This query also performs a little faster, but the indexes will give you the best increase in performance.
General Comments
You can name CTEs however you like, but it does confuse things a little bit if you give them the same name as the table they represent. I like to give them a
CTE_*
prefix, but it's optional.You can also chain CTEs as I've done here.
In your original query, you were doing two aggregate operations when you were getting the totals for the
totalcount
in the#FinalData
. That would also be something to keep an eye out for, you can see that it's unnecessary, but SQL will do the sort and sum operation multiple times. You could just join the CTE table on instead of doing a sub-select.You Had
Should be