Sql-server – why are estimated rows exponentially increasing with each join

hintsplan-guidessql serversql-server-2016statistics

summary

We have an Object-Oriented database schema that we query with Hibernate with:

  • 1 parent table (NAL_Actions)
  • 7 child tables (NAL_*)

If I want to query a couple of these "action"-objects we have to join all 1+7 tables together (because we don't know what type of child object we need) which isn't great but because of the query execution plan generated by SQL Server results in very slow queries (index scans with 2,5 million rows).

details

The parent table also has a foreign key (actionGroups) that according to the fullscan statistics results in 1.15 "action"-rows on average.

This query planedit, now also on: paste-the-plan is over estimating the number of rows for each subsequent table that is joined which I and statistics know is never more than two rows as all tables are joined on their (common) distinct primary key, which is also the clustered index, which means that the number of estimated rows cannot increase, as can be seen from this definition of one of the Child-tables:

USE [NAL_PRD]
GO

/****** Object:  Table [dbo].[NAL_Unblocks]    Script Date: 6/24/2020 6:04:40 AM ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[NAL_Unblocks](
    [actionId] [bigint] NOT NULL,
    [unblockingAmount] [decimal](5, 2) NOT NULL,
    [productSequenceNumber] [int] NULL,
 CONSTRAINT [PK_NAL_UNBLOCKS] PRIMARY KEY CLUSTERED 
(
    [actionId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

ALTER TABLE [dbo].[NAL_Unblocks]  WITH CHECK ADD  CONSTRAINT [fk_unblocks_actions] FOREIGN KEY([actionId])
REFERENCES [dbo].[NAL_Actions] ([id])
ON DELETE CASCADE
GO

ALTER TABLE [dbo].[NAL_Unblocks] CHECK CONSTRAINT [fk_unblocks_actions]
GO

As SQL Server falsely assumes the number of rows exponentially increases with each joined table it changes the join strategy for the last 3 tables:

  • from: "clustered index seek + nested loops"
  • to: "clustered index scan + merge join"

This means that we need to scan the clustered index of a table of 2,500,000 rows instead of 1 or 2 index seeks

what I've tried so far

  1. hints work, e.g.: WITH(FORCESEEK) for every join, or: OPTION (LOOP JOIN) for the entire query, (but I cannot change the application)
  2. this leads me to suggest plan guides as the next option (for SQL Server 2016 – 13.0.5102.14) but although that works fine with sp_executesql the plan guide does not seem to take immediate effect in the application, although I took the SQL from the Query Store and updated the statistics (but maybe updating statistics was a false assumption so maybe I need to restart the application and/or drop the plan – but how without DBCC FREEPROCCACHE as I am limited to db_owner permissions, so not VIEW/ALTER SERVER STATE permissions? Maybe I should switch the application / the JDBC driver to use sp_executesql with prepareSQL=2 instead of the default (sp_prepare + sp_execute)?.

What I have not yet tried:

  • upgrading to SQL Server 2017 to try adaptive query processing, should that fix this?

Best Answer

As you're using hibernate this is going to be a bit cumbersome. You can 'guide' the optimizer by telling him how to do the joins. For example select * from a inner hash join b on a.x = b.x will tell the optimizer to do a hash join. It could also be a merge join or a loop hint.

This might result into writing the sql queries manually. Using hints should also be one of the last things to try. Your tables might change over time and the hints might not apply/be good for the performance at a later stage (mostly when switching from development to production).