Sql-server – Update many-to-many relationship table

performancesql serversql-server-2008-r2

I have three tables: task, taskmembers, members. Each task may have up to five people associated to it. Currently, when someone updates a task I delete all the current members and insert them as a new record. Otherwise, I have to figure out if the person is still a member of the task and, if not, delete them. Also, I have to delete the people no longer associated with the task and update the people still associated to the task.

My concern is that I'm currently using an int as my primary key within the members table. I know MS SQL int values can go up to 2,147,483,647 before overflowing, but it's still a concern of mine. Also, is deleting all the records a good practice? Thank you in advance.

Best Answer

The first thing you'll want to do is make sure there is an alternate key (usually via a unique constraint) defined in the taskmembers table, that consists of the task id and member id. This key will drive all data manipulation, and the surrogate key won't be involved at all.

Once you have that key in place, I suggest using the MERGE statement to perform data manipulation. This will avoid having to delete a subset of data and re-insert it. Instead, by matching the source and target data based on the alternate key, incremental insert/updates can take place. The MERGE statement also allows you to delete from the target where records don't exist in the source (i.e., a member of the task was removed).

The DELETE/INSERT pattern is really wasteful of resources because of the extra amount of logging involved, and the unnecessary locks that are required while the data manipulation occurs. If this is a "hot" table, this kind of pattern can easily lead to deadlocks (if not using snapshot isolation), and as you're finding out, wasted key space usage. With the MERGE pattern, new keys would only be required when members are actually added to tasks.