Sql-server – Adding an INNER JOIN ruins query performance due to different execution plan despite updated STATISTICS and RECOMPILE, why

execution-planindexsql server

In the home-page of my multi-tenant web application, Linq-to-Entities (EF6) is used to get a paged list of Calendar Events. This data is JOINed with some other tables to produce this query, below.

The tables all have composite primary keys that combine their TenantId with their own IDENTITY. I put the IDENTITY column first in clustered index because STATISTICS only looks at the first column in an index, and I understand having the most-specific column first improves performance of get-single-row queries. (If I'm wrong about this then please let me know!).

The schema is this:

CREATE TABLE dbo.Events (
    EventId     int NOT NULL IDENTITY(1,1),
    TenantId    int NOT NULL,
    CustomerId  int     NULL,
    EventTypeId int NOT NULL,
    LocationId  int NOT NULL,
    Etc...

    CONSTRAINT PRIMARY KEY ( EventId, TenantId ),
    
    CONSTRAINT FOREIGN KEY FK_Events_Tenants ( TenantId ) REFERENCES dbo.Tenants ( TenantId ),
    
    CONSTRAINT FOREIGN KEY FK_Events_Customers ( CustomerId, TenantId ) REFERENCES dbo.Customers ( CustomerId, TenantId ),
    
    CONSTRAINT FOREIGN KEY FK_Events_EventTypes ( EventTypeId, TenantId ) REFERENCES dbo.EventTypes ( EventTypeId, TenantId ),
    
    CONSTRAINT FOREIGN KEY FK_Events_Locations ( LocationId, TenantId ) REFERENCES dbo.Locations ( LocationId, TenantId ),
)

The query that Linq generates is this:

(reformatted for readability, e.g. indentation, renaming Linq's Extent1 alias to shorter aliases based on the table's name, removing redundant column aliases, and removing Linq's idiosyncratic Project1, these changes did not affect the execution plan)

SELECT
    e.(etc),
    c.(etc),
    l.(etc),
    t.(etc)
FROM
    dbo.Events AS e
    LEFT OUTER JOIN dbo.Customers AS c ON
        c.TenantId = e.TenantId
        AND
        c.CustomerId = e.CustomerId
    INNER JOIN dbo.Locations AS l ON
        l.TenantId = e.TenantId
        AND
        l.LocationId = e.LocationId
    INNER JOIN dbo.EventTypes AS t ON
        t.TenantId = e.TenantId
        AND
        t.TypeId = e.TypeId
WHERE
    e.TenantId = @tenantId

ORDER BY
    ROW_NUMBER() OVER ( ORDER BY e.EventId DESC, e.TenantId DESC )
OFFSET
    @offset ROWS FETCH NEXT @pageSize ROWS ONLY
  • The @tenantId parameter is 123.
  • The @offset parameter is 0.
  • The @pageSize parameter is 50.
  • I ran the query before and after a rebuild of all indexes, as well as updating all STATISTICS with sp_updatestats @resample = 'resample'. This had no effect on the execution plan.
  • I also ran them with OPTION(RECOMPILE). This also had no effect on the execution plan.
  • This query is being run on Azure SQL, btw.

Recently (ever since a few weeks ago) this query was running very slowly, so I got the Execution Plan for that query from SSMS:

enter image description here

  • I have circled all execution plan nodes that pass 41,109 rows around.
  • 41,109 is the number of rows in my dbo.Events table that correspond to the TenantId account I was looking at.
    • So every time I saw 41,109 in the plan I know that it was reading all dbo.Events rows – or index nodes – for that Tenant.
    • But the query has paging via OFFSET ROWS FETCH ROWS with an ORDER BY that uses the Clustered Index key columns – so it should not be reading more than 50 rows from the dbo.Events table!
  • I've uploaded the Execution Plan to Brent Ozar's PasteThePlan website: https://www.brentozar.com/pastetheplan/?id=Hyc1bdkEO

I saw that the plan was reading from the dbo.EventTypes table first, and using the output of that read to filter rows from dbo.Events – but that's not my intention. My intention is for dbo.Events to be the "primary" table, and for it to get rows from the other tables (dbo.Customers, dbo.EventTypes, and dbo.Locations) based on the rows it read from dbo.Events.

When I removed the INNER JOIN dbo.EventTypes AS t ON... part of the query (and the concordant columns) and re-ran the query, the execution plan was now what I intended it to be and the query ran very fast (except without the dbo.EventTypes data that I still need…):

SELECT
    e.(etc),
    c.(etc),
    l.(etc)
FROM
    dbo.Events AS e
    LEFT OUTER JOIN dbo.Customers AS c ON
        c.TenantId = e.TenantId
        AND
        c.CustomerId = e.CustomerId
    INNER JOIN dbo.Locations AS l ON
        l.TenantId = e.TenantId
        AND
        l.LocationId = e.LocationId
WHERE
    e.TenantId = @tenantId

ORDER BY
    ROW_NUMBER() OVER ( ORDER BY e.EventId DESC, e.TenantId DESC )
OFFSET
    @offset ROWS FETCH NEXT @pageSize ROWS ONLY

The execution plan now shows it reading exactly 50 rows from the dbo.Events table. (Curiously, it also shows it reading from the dbo.Locations table and indexes twice – I don't understand why).

enter image description here

The Execution Plan window does not indicate that any indexes are missing for either query. When dbo.EventTypes is used in the query then the execution-plan shows the Index Seek node on dbo.Events is using the TenantId as the only seek predicate and ignoring the EventId predicate that would be used because of the ORDER BY clause.

Unfortunately none of the online resources I've found that advise improving execution plans by judicious use of indexes helped me in this cause because, as far as I know, the dbo.Events table has plenty of coverage by indexes (there's more than 15 indexes on the table) and none of them talked about how indexes are used by ORDER BY, TOP, and OFFSET. Additionally they all advised that incorrect estimated row counts (like how I see 2616 estimated but 41109 actual, or 77192 estimated but 41109 actual) would be rectified by updating STATISTICS, but I did update all statistics and that didn't help at all.

Best Answer

Try building it with the clustered index keys in the reverse order - so tenantid first. for the inner joins add the

  l.TenantId = @tenantId 

to the join criteria

change your order by from

ORDER BY
    ROW_NUMBER() OVER ( ORDER BY e.EventId DESC, e.TenantId DESC )

to

ORDER BY e.EventId DESC, e.TenantId DESC

although with the reversed order you would be better served with

ORDER BY e.TenantId DESC, e.EventId DESC

if that is still acceptable If you still get a sort consider building the PK with DESC order as well