SQL Server 2016 – Optimizing Single-Row DELETEs with Foreign Keys

deleteforeign keysql serversql-server-2016

In a database I am working on, there is a table called persons with about one million rows, and 60 FK (FK) constraints from other tables (some with many millions rows) pointing to it.

If I delete one row from persons, it takes many minutes, which would not be a problem, but it also keeps the table locked, thus blocking the database for all processes. In the past, this has caused users to report that the system is down.

Of course the situation would greatly improve if I added supporting indexes for all the FKs (currently, only 20 of the 60 tables have it). But many of these FKs are for columns like modified_by, so all the indexes would have no other purpose, and would degrade performance of the system in the daily operations, only to gain an improvement in an exceptional case.

Before running the DELETE, I already make sure that all the referencing rows have been deleted or updated. I do it manually, since I am strongly against the use of CASCADE.

I am not considering soft-delete because otherwise I would have to change all software that reads the persons table to skip deleted rows.

Question

Is there a way to (possibly temporarily) change the locking mechanism of the persons table, so that even if the DELETE takes an hour, it does not impact concurrent processes?

Here's the anonymized execution plan.

Considerations

Disabling FKs might be a possibility. The risk is that someone else creates inconsistency while I delete the row, then I cannot re-enable the FKs.
Rows to delete: typically one at a time. Either manual operation or scheduled.
Interesting point: I did not check the execution plan immediately, but apparently, almost all operations cost 0%, except 7 cases of "Clustered index Scan (Clustered)" on the PK of referencing tables; one of them costs 57%, the other from 1% to 16%. I still don't understand why it should scan a clustered index.

Best Answer

the situation would greatly improve if I added supporting indexes for all the FK (currently, only 20 of the 60 tables have it)

You should always index fields used as keys, foreign or otherwise. This gives the database far more options when analysing how to execute the query and adding them may actually increase the overall performance of your application.

many of these FK are for columns like "modified by", so all the indexes would have no other purpose

As with [pretty much] everything on Planet Database, it's a compromise.
Here, you're balancing data integrity - knowing with certainty who changed something - against speed - not having those extra indexes that slow down inserts. Does it really matter if you don't know (with certainty) the user that changed these records? If not, you're OK as you are. If it does matter - and in lots of cases, it really, really does - then you're stuck with having the overhead of the indexes.

Before running the DELETE, I already make sure that all the referencing records have been deleted ...

With the FK's in place, you would not be able to delete them anyway, so the values of this activity is dubious.

The question really should be, why is the delete taking an hour?
If you're only deleting one record out of a million row table and there are no rows in the tables with FK's back to this table, then what's taking the time? It should be practically instantaneous.

Slow queries, even deletes, are often caused by missing indexes - are you performing the delete based on the record's Primary Key field? Or some other combination of [not-indexed] fields?

Related Solutions

Sql-server – Performance tuning a cascading delete with many foreign keys

To answer your main question directly, the sorts are there to present rows to update operators (performing deletions in this case) in index key order. The principle at work here is that sorting on the keys will promote sequential access to the index.

This can be a good optimization, though the details depend on your hardware, how likely the affected pages are to be in memory, and whether the sorts can complete within the memory allocated to them. When the optimizer decides the cost of sorting will be paid back by the increased efficiencies associated with sequential index access, it sets a property DMLRequestSort on the update operator:

DML Request Sort

The optimizer may also decide to split the update into separate operators to maintain the clustered index (or heap) and then the nonclustered indexes. often, it will decide to sort more than once - first for the clustered index keys, and then again for the nonclustered index(es). Again, where sorting is considered optimal, each index update operator will have the DMLRequestSort property set to true.

All that said, the things I would fix first would be to eliminate the index scans where the join operator they feed is a nested loops join, and to remove the eager index spools, which are inserting rows into an empty index every time the query is executed. An eager index spool is often the clearest possible sign that you are missing a useful permanent index. The seek predicate in the index spool operator identifies the keys the optimizer would like an index on.

Examples of tables that are missing a nonclustered index (requiring an eager index spool) are:

child6gc8Selections
gc9s
child7s
gc6s

eager index spools

Examples of tables that are currently being scanned below a nested loops join are:

child1
parentObjectMessages
child8s
child7s
child6s
child5s
child4s
child3s
child2s

scan below nested loops

Taking the example shown above, the Clustered Index Scan has an output list of Id, parentObjectId, the Nested Loops Join predicate is child7s.parentObjectId = parentObject.Id, and the join output column list is child7s.Id.

From that information, it seems a good access method (index) on child7s for this part of the query would be keyed on parentObjectId with Id as an included column. You should be able to work out how best to work this into your existing indexing strategy.

The following are examples of tables where the optimizer is currently choosing a hash join. I would check tables like this to ensure that is a reasonable access method:

child6gc8Selections
gc2s
gc5s
gc6Properties

hash joins

The table child2bigChild also participates in a merge join where an explicit sort is necessary. Again, I would check to see if this sort could be avoided.

sort before merge

Once the basic indexing issues are resolved, we can look at other optimizations if necessary.

Sql-server – Optimizing an audit table for inserts and deletes

Typically to optimize inserts/deletes for such a table you would cluster on the datetime column. (You are tracking when these events happen, I presume.)

This way, inserts go to the end of the table always, minimizing the "bad" kind of page splits. Deleting or archiving data is easy because the clustered index will support range deletes, and these operations will not typically be locking the pages where you're performing inserts (unless you have escalation for some reason).

You will probably want other indexes like what you're describing to support queries, and maintenance of these of course will interfere somewhat with DML.

I'm not sure I understand why you only want to keep a certain number of rows for any UserId/EventId combination. If one user does a whole bunch of stuff today, do you really want to remove all the stuff they did yesterday, last week, etc., while you keep older data for users that are less active? Wouldn't it make more sense to just keep data based on time (e.g. retain two weeks of history)?

I'm also not sure why the delete absolutely needs to be coupled to the insert. Can't the delete be deferred to a background process instead of holding up the insert transaction?