Sql-server – Performance tuning a cascading delete with many foreign keys

Tags: delete, foreign key, performance, performance-tuning, sql server, sql-server-2008-r2

I have a delete query that is taking a long time. Looking at the execution plan, I see that most of the estimated cost is in a section of the data model that has a lot of data (say, 400k rows). That seems reasonable, but there is one thing I don't understand.

Stripped down view of data model:

table ParentObject 
      int parentObjectId (PK)

table Child
      int childId (PK)
      int parentId (FK)
      <stuff>

table GrandChild
      int grandChildId (PK)
      int childId (FK)
      <more stuff>

Where a parent object might have 200,000 Children, and a Child has 2 or so GrandChildren. I am interested in tuning the performance of:

DELETE FROM ParentObject WHERE parentObjectId = %d;

On GrandChild, there is an extra nonclustered index on (childId + two other columns) as well as the primary key index. On Child, there is an extra nonclustered unique index on (parentId + two other columns).

What I saw in the query plan is that, while deleting the GrandChild rows, there were two expensive sort operations mixed in with the deletions, and I don't understand why they are there.

What should I be looking at to help this delete operation go faster? Does it need to sort? Would it help if I denormalized the ids and added a parent Id to the grandchild table? Did I build my index stupidly?

The full execution plan is here.

Best Answer

To answer your main question directly, the sorts are there to present rows to update operators (performing deletions in this case) in index key order. The principle at work here is that sorting on the keys will promote sequential access to the index.

This can be a good optimization, though the details depend on your hardware, how likely the affected pages are to be in memory, and whether the sorts can complete within the memory allocated to them. When the optimizer decides the cost of sorting will be paid back by the increased efficiencies associated with sequential index access, it sets a property DMLRequestSort on the update operator:

[Image: DML Request Sort]

The optimizer may also decide to split the update into separate operators to maintain the clustered index (or heap) and then the nonclustered indexes. Often, it will decide to sort more than once: first for the clustered index keys, and then again for the nonclustered index(es). Again, where sorting is considered optimal, each index update operator will have the DMLRequestSort property set to true.
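As a rough way to check this property without clicking through each operator, the cached plan XML can be searched for it. The query below is a sketch, not a definitive diagnostic: it assumes the delete's plan is still in the plan cache, the LIKE filter on the statement text is a hypothetical pattern you would adjust, and shredding every cached plan can be expensive on a busy server.

```sql
-- Sketch: list operators in cached plans that carry a DMLRequestSort attribute.
-- The LIKE pattern is an assumption; change it to match your own statement.
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT
    st.text AS statement_text,
    -- The attribute sits on the update element; its parent RelOp names the operator.
    op.node.value('(../@PhysicalOp)[1]', 'nvarchar(128)') AS physical_op,
    op.node.value('@DMLRequestSort', 'nvarchar(10)')      AS dml_request_sort
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle)   AS st
CROSS APPLY qp.query_plan.nodes('//*[@DMLRequestSort]') AS op(node)
WHERE st.text LIKE N'%DELETE%ParentObject%';
```

The value is returned as text rather than compared against a literal, so the query does not depend on how the boolean is serialized in the plan XML.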

All that said, the things I would fix first would be to eliminate the index scans where the join operator they feed is a nested loops join, and to remove the eager index spools, which are inserting rows into an empty index every time the query is executed. An eager index spool is often the clearest possible sign that you are missing a useful permanent index. The seek predicate in the index spool operator identifies the keys the optimizer would like an index on.

Examples of tables that are missing a nonclustered index (requiring an eager index spool) are:

child6gc8Selections
gc9s
child7s
gc6s

[Image: eager index spools]
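One way to look for candidates like these across the whole schema is to list foreign keys whose first referencing column is not the leading key of any index on the referencing table. This is a sketch under simplifying assumptions: it only inspects the first column of each foreign key, and it does not consider filtered or disabled indexes, so treat the output as a list to review rather than a list of indexes to create.

```sql
-- Sketch: foreign keys whose first column is not the leading key of any index.
SELECT
    OBJECT_SCHEMA_NAME(fk.parent_object_id)                AS referencing_schema,
    OBJECT_NAME(fk.parent_object_id)                       AS referencing_table,
    fk.name                                                AS foreign_key_name,
    COL_NAME(fkc.parent_object_id, fkc.parent_column_id)   AS referencing_column
FROM sys.foreign_keys AS fk
JOIN sys.foreign_key_columns AS fkc
    ON fkc.constraint_object_id = fk.object_id
WHERE fkc.constraint_column_id = 1          -- first column of the foreign key only
  AND NOT EXISTS (
        SELECT 1
        FROM sys.index_columns AS ic
        WHERE ic.object_id   = fkc.parent_object_id
          AND ic.column_id   = fkc.parent_column_id
          AND ic.key_ordinal = 1            -- leading key column of some index
      )
ORDER BY referencing_schema, referencing_table, foreign_key_name;
```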

Examples of tables that are currently being scanned below a nested loops join are:

child1
parentObjectMessages
child8s
child7s
child6s
child5s
child4s
child3s
child2s

[Image: scan below nested loops]

In the example shown above, the Clustered Index Scan has an output list of (Id, parentObjectId); the Nested Loops Join predicate is child7s.parentObjectId = parentObject.Id; and the join output column list is child7s.Id.

From that information, it seems a good access method (index) on child7s for this part of the query would be keyed on parentObjectId with Id as an included column. You should be able to work out how best to fit this into your existing indexing strategy.
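A minimal sketch of such an index follows. The dbo schema and the index name are assumptions made for illustration; only the table and column names come from the plan described above.

```sql
-- Sketch: index keyed on the join column, carrying Id so the scan becomes a seek.
-- Schema (dbo) and index name are hypothetical.
CREATE NONCLUSTERED INDEX IX_child7s_parentObjectId
ON dbo.child7s (parentObjectId)
INCLUDE (Id);
```

If Id is already the clustered index key of child7s, it is carried in every nonclustered index anyway, so the INCLUDE clause is harmless but redundant in that case.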

The following are examples of tables where the optimizer is currently choosing a hash join. I would check tables like this to ensure that is a reasonable access method:

child6gc8Selections
gc2s
gc5s
gc6Properties

[Image: hash joins]

The table child2bigChild also participates in a merge join where an explicit sort is necessary. Again, I would check to see if this sort could be avoided.

[Image: sort before merge]

Once the basic indexing issues are resolved, we can look at other optimizations if necessary.

Related Solutions

Sql-server – Query tuning – performance

You should look at the Execution Plan and see where most of the cost is for the query. Without the results of the Execution Plan it is difficult to offer advice.

I suspect the query optimizer is using the best process possible for the query as it is structured. You might try breaking the query into smaller parts: for example, query the email field to get the results into a temp table first, then run your main query using the temp table of emails.

This might not make the query run faster, but it might expose information in the Execution Plan that will help you troubleshoot.

Another option is to use INCLUDE to add columns to your nonclustered index, so that the index scan can return all the required values, which may speed up the process.

Sql-server – Considerations for performance comparison with a high fragmented heap

"Since the three table copies are brand new, there is no fragmentation in place. For a fair comparison I also rebuilt all indexes of the original table."

A more realistic test would be to try to recreate the fragmentation that would have resulted from the table being in each design from the start. This way you are comparing the designs as they would look after real-world use rather than after a fresh rebuild.

If your application keeps a full audit trail for that data, then you could perhaps rebuild each copy by replaying that audit. Otherwise you might need to make up some heuristic (for instance, if the data includes a creation or last-modified date, inserting the rows into each copy in that order).

One thing to note when doing this is to do each insert individually rather than as a block copy. When you insert or update many rows at once, SQL Server is smart enough to bunch the index updates to reduce page splits, which it cannot do with the individual inserts that would result from real application access. The following script illustrates this (using the amount of space allocated as an indication of free-page-space fragmentation due to page splits caused by the randomness of UUID ordering):

SET NOCOUNT ON;
CREATE DATABASE TestFragUUID;
GO
USE TestFragUUID;
-- Two identical tables: random GUID clustered key plus a unique nonclustered index
CREATE TABLE IndividualInserts (ID UNIQUEIDENTIFIER PRIMARY KEY CLUSTERED, AnotherID UNIQUEIDENTIFIER UNIQUE);
CREATE TABLE SingleLargeInsert (ID UNIQUEIDENTIFIER PRIMARY KEY CLUSTERED, AnotherID UNIQUEIDENTIFIER UNIQUE);
GO
-- 100,000 single-row inserts: the batch below is repeated by the count after GO
INSERT IndividualInserts SELECT NEWID(), NEWID();
GO 100000
-- One 100,000-row insert: five cross-joined copies of a 10-row set give 10^5 rows
WITH CTen AS (SELECT TOP 10 name FROM sys.objects)
INSERT SingleLargeInsert
SELECT NEWID(), NEWID()
FROM   CTen unit
CROSS JOIN CTen ten
CROSS JOIN CTen hun
CROSS JOIN CTen tho
CROSS JOIN CTen tth;
GO
-- Compare the space allocated to each table
EXEC sp_spaceused 'IndividualInserts';
EXEC sp_spaceused 'SingleLargeInsert';
GO
-- Clean up
USE master;
DROP DATABASE TestFragUUID;
