You don't need 30 join conditions for a FULL OUTER JOIN
here.
You can just Full Outer Join on the PK, preserve rows with at least one difference with WHERE EXISTS (SELECT A.* EXCEPT SELECT B.*)
and use CROSS APPLY (SELECT A.* UNION ALL SELECT B.*)
to unpivot out both sides of the JOIN
ed rows into individual rows.
WITH TableA(Col1, Col2, Col3)
AS (SELECT 'Dog',1,1 UNION ALL
SELECT 'Cat',27,86 UNION ALL
SELECT 'Cat',128,92),
TableB(Col1, Col2, Col3)
AS (SELECT 'Dog',1,1 UNION ALL
SELECT 'Cat',27,105 UNION ALL
SELECT 'Lizard',83,NULL)
SELECT CA.*
FROM TableA A
FULL OUTER JOIN TableB B
ON A.Col1 = B.Col1
AND A.Col2 = B.Col2
/*Unpivot the joined rows*/
CROSS APPLY (SELECT 'TableA' AS what, A.* UNION ALL
SELECT 'TableB' AS what, B.*) AS CA
/*Exclude identical rows*/
WHERE EXISTS (SELECT A.*
EXCEPT
SELECT B.*)
/*Discard NULL extended row*/
AND CA.Col1 IS NOT NULL
ORDER BY CA.Col1, CA.Col2
Gives
what Col1 Col2 Col3
------ ------ ----------- -----------
TableA Cat 27 86
TableB Cat 27 105
TableA Cat 128 92
TableB Lizard 83 NULL
Or a version dealing with the moved goalposts.
SELECT DISTINCT CA.*
FROM TableA A
FULL OUTER JOIN TableB B
ON EXISTS (SELECT A.* INTERSECT SELECT B.*)
CROSS APPLY (SELECT 'TableA' AS what, A.* UNION ALL
SELECT 'TableB' AS what, B.*) AS CA
WHERE NOT EXISTS (SELECT A.* INTERSECT SELECT B.*)
AND CA.Col1 IS NOT NULL
ORDER BY CA.Col1, CA.Col2
For tables with many columns it can still be difficult to identify the specific column(s) that differ. For that you can potentially use the below.
(though just on relatively small tables as otherwise this method likely won't have adequate performance)
SELECT t1.primary_key,
y1.c,
y1.v,
y2.v
FROM t1
JOIN t2
ON t1.primary_key = t2.primary_key
CROSS APPLY (SELECT t1.*
FOR xml path('row'), elements xsinil, type) x1(x)
CROSS APPLY (SELECT t2.*
FOR xml path('row'), elements xsinil, type) x2(x)
CROSS APPLY (SELECT n.n.value('local-name(.)', 'sysname'),
n.n.value('.', 'nvarchar(max)')
FROM x1.x.nodes('row/*') AS n(n)) y1(c, v)
CROSS APPLY (SELECT n.n.value('local-name(.)', 'sysname'),
n.n.value('.', 'nvarchar(max)')
FROM x2.x.nodes('row/*') AS n(n)) y2(c, v)
WHERE y1.c = y2.c
AND EXISTS(SELECT y1.v
EXCEPT
SELECT y2.v)
Best Answer
There are none that I know of that will show you the exact results - maintaining hooks so such tools could spy on what is going on would probably be inefficient in production (no doubt MS can do it in their development builds). While intermediate results aer sometimes dumped to temporary tables for further processing, they are often never materialised as a whole unit instead being streamed between parts of the query engine as they are produced.
If you run the query in SQL Server Management Studio, you can request that the query plan used is output. While that will not give you the entire intermediate results it will show you what comparisons are being performed and approximately how many rows are effectively output at each step - if you have a Cartesian product between a pair of large tables this should hopefully make it clear where in the process this happens.
For more specific help you'll need to add more detail to your question (please add it to the question, not as comments here, then there is better visibility of the new information for people who visit this page later):