SQL Server – Query Optimization for CTE with OPENQUERY

postgresql-9.3sql-server-2008-r2t-sql

I am running the below query that utilizes a CTE – the full recordset is roughly 10K records. When I do a Select Top 1000 the query takes roughly 45 seconds to return the results. I am querying a postgresql database from sql server 2008

Is there a quicker way of writing this query?

Select [Employee Name] = employeename,[Customer Number] = Customernumber
,AVG(quantityordered) As AverageQty,AVG(Price) As AveragePrice
,[Sale Date] = CAST(Saledate As Date)
FROM OPENQUERY(192.155.10.22,'with MaxNumber as
                      (   
                        select max(cc2) as "MCN"   
                        ,Customernumber    
                        from employeeinfo    
                        group by Customernumber
                      )
                      Select
                      c1.employeename,c1.Customernumber
                      ,c1.Saledate,ep.saledc1cription
                      ,eq.quantityordered,eq.price
                      FROM MaxNumber c2
                      INNER JOIN Customer c1
                      ON c2.Customernumber=c1.Customernumber
                      AND COALESCE(c2."MCN",0)=COALESCE(c1.cc2,0)
                      INNER JOIN orderpart ep
                      ON ep.Customer = c1.Customerid
                      INNER JOIN orderquantity eq
                      ON eq.employeename = ep.employeename
                      INNER JOIN employee cm
                      ON c1.employee = cm.employee')
WHERE CAST(Saledate As Date) >= '01/01/2017'
GROUP BY customername, Customernumber, Saledate

EDIT
There are no indexes on the tables, and I am unable to modify them to add any. And this is the execution plan
Query Execution Plan

Best Answer

While I haven't used OPENQUERY, I'll add some - more or less common sense - notes:

  • You are not doing any joins between the remote (Postgres) server tables and the SQL Server ones. I suggest you move the whole query inside the Openquery call.

  • The COALESCE() suggests that the 2 columns are nullable? Are they?
    If not, remove the COALESCE() calls which prohibit the use of indexes. If they are nullable, you can try to modify the predicate to something like (if they can be 0, it'd have to be more complex):

    AND ( c2."MCN" = c1.cc2  OR  c2."MCN" IS NULL AND c1.cc2 IS NULL )
    
  • The CAST(Saledate As Date) >= '01/01/2017' looks like a killer, too. There is no reason to use CAST. The expression can be written as (the cast may not be needed at all in Postgres, depending on the column type):

    Saledate >= CAST('2017-01-01' AS date)
    
  • After pushing the whole query to the linked server, there is no reason to have the WHERE and GROUP BY in the external query. You can collapse the query into one level.

  • After the rewrite, and if it is still slow, check the execution plan in the remote server. Besides indexes, examine the plan regarding the CTE. CTEs in Postgres are optimization fences, and their use restricts the optimizer into certain options - which may be good or bad for efficiency - depending on too many variables.


The query, in Postgres syntax:

SELECT *
FROM OPENQUERY(192.155.10.22, '
    WITH MaxNumber AS
      (   
        SELECT 
            MAX(cc2) AS "MCN"   
            Customernumber    
        FROM employeeinfo    
        GROUP BY Customernumber
      )
    SELECT 
        c1.employeename           AS "Employee Name",
        c1.Customernumber         AS "Customer Number",
        AVG(eq.quantityordered)   AS "AverageQty",
        AVG(eq.price)             AS "AveragePrice",
        CAST(c1.Saledate AS Date) AS "Sale Date"
    FROM
        MaxNumber c2
      INNER JOIN Customer c1
        ON  c2.Customernumber = c1.Customernumber
        AND COALESCE(c2."MCN", 0) = COALESCE(c1.cc2, 0)
      INNER JOIN orderpart ep
        ON  ep.Customer = c1.Customerid
      INNER JOIN orderquantity eq
        ON  eq.employeename = ep.employeename
      INNER JOIN employee cm
        ON  c1.employee = cm.employee
    WHERE 
        Saledate >= CAST('2017-01-01' AS date)
    GROUP BY 
        c1.customername, 
        c1.Customernumber, 
        CAST(c1.Saledate AS date)
')

(will still need fixing, there is a mismatch between the openquery which has select employeename and the external query which has group by customername)