While doing some pre-checking on a table before adding a new Foreign Key, we were querying to see how many current rows would violate the new key. This is a fairly active database with near constant inserts on the tables in question.
The FK would be from MessagePatientIdentifier.MessageID to Message.MessageID.
The query we used was as simple as could be:
select * from MessagePatientIdentifier as mpi
where
MessageID not in (select MessageID from Message)
The issue that we are seeing, is that rows are being returned from MessagePatientIdentifier
MessagePatientID | MessageID
553bde76-47d4-4ec3-96d1-b5d2e98931e1 | 7d45464d-8cc4-4a2e-8828-020722165b39
in this case and when you then select * from Message where MessageID = 7d45464d-8cc4-4a2e-8828-020722165b39
, the record does in fact exist.
We then moved on and tried the same query, but on a table that DOES have a FK in place, referencing Message table in the same way. The same results…the query is reporting that Child table records exist without parent the corresponding parent table (Message) record.
MessageRecipID | MessageID
26d6d632-87b3-407e-aeb0-04552981e5f8 | 750f0fb4-3e6c-485d-996e-f061f8caa360
Then, once again if you select * from Message where MessageID = 750f0fb4-3e6c-485d-996e-f061f8caa360
, this returns the record.
This data comes in via stored procedures from a Mirth Server as well as through a BizTalk WCF-SQL send port. The proc inserts the message record, gets the new uniqueidentifier
as an output variable and then uses that to call auxiliary stored procs to insert into both the MessagePatient.MessageID
and MessageRecip.MessageID
.
Is this expected behaviour and I'm just not educated on the inner workings of SQL? Technically I believe BizTalk runs everything in a transaction, so it shouldn't get out of sync, even if it did where would it get the MessageID value if not from the Message insert?
What am I missing here?
Best Answer
If you can restore a static backup somewhere, and do analysis on that unchanging copy, I think you'll find that the weird behavior you're seeing goes away.
Read Committed
The default isolation level in SQL Server is READ COMMITTED. Under this isolation level, you read the data that is currently committed to the database. The only guarantee that it provides is that when SQL Server reads a given page, it will never return an uncommitted result (known as a dirty read).
In your case you're scanning two tables and comparing them. While SQL Server is doing that scan, there is data movement.
MessageSubject
or some other column is updated to a longer value, and this results in a page split. Or perhaps the key column is updated and the row is physically moved "up" the index into the portion you already read.Your examples use integers, but in your question you mention
uniqueidentifier
. Sinceuniqueidentifier
s are random, the randomness means rows are constantly being inserting into random places on the table. This increases the chances that a page fills up and has to be split to accommodate the new page, and increases the chances you see these phenomena.Read the section titled "Locking Read Committed Behaviours" in the above linked article for a thorough explanation.
The fix?
If you want to avoid these off phenomena, don't use the default isolation level in SQL Server. My favorite & default isolation level is READ COMMITTED SNAPSHOT, which avoids these problems. (Read about all the isolation levels here to decide which one is right for you.)
Or if it's just for a one-time analysis, you could use a database snapshot to create a static image to query, or you could stop writing to the database, or restore a copy elsewhere. Stopping writes will stop data movement, and you won't run into the problem.