Sql-server – Tool to view intermediate query results

sql-server, tools

I am using SQL Server 2008 Express edition and I've spent two days on a problem where there seems to be a cross join happening when I run a query.

I would like to see, visually, the row set returned by each join operation.

Is there any tool that can give you intermediate query results? I couldn't find one on Google.

EDIT

I solved this problem with the oldest trick in the book: manually running the query before joining to see exactly where the cross join was happening. I found that a mapping DB returned 3 results for a join instead of just one, because it needed two conditions for a proper inner join instead of one.
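
The kind of check described in the edit above might look roughly like the sketch below. Every name in it (@Orders, @Mapping, SourceId, RegionId, Label) is hypothetical and only stands in for the real tables, which are not shown in the question. It first looks for join keys that match more than one mapping row, then compares the row count of the join with one condition against the join with both.

```sql
-- Hypothetical tables, for illustration only.
DECLARE @Orders  TABLE (OrderId int, SourceId int, RegionId int);
DECLARE @Mapping TABLE (SourceId int, RegionId int, Label varchar(20));

INSERT INTO @Orders  VALUES (1, 10, 1);
INSERT INTO @Mapping VALUES (10, 1, 'North'), (10, 2, 'South'), (10, 3, 'East');

-- Keys that match more than one mapping row (the source of the fan-out).
SELECT SourceId, COUNT(*) AS MatchCount
FROM   @Mapping
GROUP  BY SourceId
HAVING COUNT(*) > 1;

-- One join condition: the order row is repeated once per matching mapping row.
SELECT COUNT(*) AS RowsWithOneCondition
FROM   @Orders AS o
       INNER JOIN @Mapping AS m
         ON m.SourceId = o.SourceId;

-- Both join conditions: each order row matches a single mapping row.
SELECT COUNT(*) AS RowsWithBothConditions
FROM   @Orders AS o
       INNER JOIN @Mapping AS m
         ON m.SourceId = o.SourceId
            AND m.RegionId = o.RegionId;
```

With this sample data the one-condition join counts 3 rows and the two-condition join counts 1, which mirrors the "3 results instead of one" symptom.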

Best Answer

There are none that I know of that will show you the exact results. Maintaining hooks so that such tools could spy on what is going on would probably be inefficient in production (no doubt MS can do it in their development builds). While intermediate results are sometimes dumped to temporary tables for further processing, they are often never materialised as a whole unit, instead being streamed between parts of the query engine as they are produced.

If you run the query in SQL Server Management Studio, you can request that the query plan used is output. While that will not give you the entire intermediate results it will show you what comparisons are being performed and approximately how many rows are effectively output at each step - if you have a Cartesian product between a pair of large tables this should hopefully make it clear where in the process this happens.
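
As a rough sketch of requesting the plan from T-SQL rather than through the SSMS menu ("Include Actual Execution Plan"), SET STATISTICS PROFILE ON returns the actual plan as an extra result set after each statement, including a Rows column with the row count each operator actually produced and an EstimateRows column with the optimizer's estimate. The temp tables #A and #B are hypothetical and only serve to produce a deliberate Cartesian product.

```sql
-- Hypothetical demo tables.
CREATE TABLE #A (Id int);
CREATE TABLE #B (Id int);
INSERT INTO #A VALUES (1), (2), (3);
INSERT INTO #B VALUES (1), (2), (3);

-- From here on, each statement also returns its actual plan as a result set.
SET STATISTICS PROFILE ON;

-- Deliberate Cartesian product: compare the Rows value on the join operator
-- with the Rows values on the two scans feeding it.
SELECT a.Id AS AId, b.Id AS BId
FROM   #A AS a
       CROSS JOIN #B AS b;

SET STATISTICS PROFILE OFF;

DROP TABLE #A;
DROP TABLE #B;
```

A join operator whose Rows value is far larger than either of its inputs is the place to look for a missing or incomplete join condition.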

For more specific help you'll need to add more detail to your question (please add it to the question, not as comments here, then there is better visibility of the new information for people who visit this page later):

What makes you think there is a cross join happening? Are you getting many more rows than you expect, then whittling them down with a DISTINCT? If it is just that the query is running very slowly and spinning the CPU, then it may instead be an index use problem or a correlated subquery that is resulting in a scan per output row (a situation that can actually be worse than a cross join performance-wise).
Could you provide a copy of the query and relevant table structures?
What size of data are we talking about?

Related Solutions

Sql-server – Easily show rows that are different between two tables or queries

You don't need 30 join conditions for a FULL OUTER JOIN here.

You can just Full Outer Join on the PK, preserve rows with at least one difference with WHERE EXISTS (SELECT A.* EXCEPT SELECT B.*) and use CROSS APPLY (SELECT A.* UNION ALL SELECT B.*) to unpivot out both sides of the JOINed rows into individual rows.

WITH TableA(Col1, Col2, Col3) 
     AS (SELECT 'Dog',1,1     UNION ALL 
         SELECT 'Cat',27,86   UNION ALL 
         SELECT 'Cat',128,92), 
     TableB(Col1, Col2, Col3) 
     AS (SELECT 'Dog',1,1     UNION ALL 
         SELECT 'Cat',27,105  UNION ALL 
         SELECT 'Lizard',83,NULL) 
SELECT CA.*
FROM   TableA A 
       FULL OUTER JOIN TableB B 
         ON A.Col1 = B.Col1 
            AND A.Col2 = B.Col2 
/*Unpivot the joined rows*/
CROSS APPLY (SELECT 'TableA' AS what, A.* UNION ALL
             SELECT 'TableB' AS what, B.*) AS CA     
/*Exclude identical rows*/
WHERE  EXISTS (SELECT A.* 
               EXCEPT 
               SELECT B.*) 
/*Discard NULL extended row*/
AND CA.Col1 IS NOT NULL      
ORDER BY CA.Col1, CA.Col2

Gives

what   Col1   Col2        Col3
------ ------ ----------- -----------
TableA Cat    27          86
TableB Cat    27          105
TableA Cat    128         92
TableB Lizard 83          NULL

Or a version dealing with the moved goalposts.

SELECT DISTINCT CA.*
FROM   TableA A 
       FULL OUTER JOIN TableB B 
         ON EXISTS (SELECT A.*  INTERSECT  SELECT B.*) 
CROSS APPLY (SELECT 'TableA' AS what, A.* UNION ALL
             SELECT 'TableB' AS what, B.*) AS CA     
WHERE NOT EXISTS (SELECT A.*  INTERSECT  SELECT B.*) 
AND CA.Col1 IS NOT NULL
ORDER BY CA.Col1, CA.Col2

For tables with many columns it can still be difficult to identify the specific column(s) that differ. For that you can potentially use the query below, though only on relatively small tables, as otherwise this method likely won't perform adequately.

SELECT t1.primary_key,
       y1.c,
       y1.v,
       y2.v
FROM   t1
       JOIN t2
         ON t1.primary_key = t2.primary_key
       CROSS APPLY (SELECT t1.*
                    FOR xml path('row'), elements xsinil, type) x1(x)
       CROSS APPLY (SELECT t2.*
                    FOR xml path('row'), elements xsinil, type) x2(x)
       CROSS APPLY (SELECT n.n.value('local-name(.)', 'sysname'),
                           n.n.value('.', 'nvarchar(max)')
                    FROM   x1.x.nodes('row/*') AS n(n)) y1(c, v)
       CROSS APPLY (SELECT n.n.value('local-name(.)', 'sysname'),
                           n.n.value('.', 'nvarchar(max)')
                    FROM   x2.x.nodes('row/*') AS n(n)) y2(c, v)
WHERE  y1.c = y2.c
       AND EXISTS(SELECT y1.v
                  EXCEPT
                  SELECT y2.v)

Sql-server – How to use merge hints to isolate complex queries in SQL Server

If you use a multi-statement UDF, then your inner select is executed exactly once for each outer row. The multi-statement UDF is treated as a black box: the execution plan will not show access to the objects used in your complex view.

On the other hand, a subquery and/or an inline UDF is flattened out by the optimizer. When this is the case, the execution plan will include access to the objects used in your complex view.
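
To make the distinction concrete, here is a minimal sketch of the two function styles. All object names (dbo.DemoRows, dbo.DemoInline, dbo.DemoMulti) are hypothetical, and dbo.DemoRows merely stands in for the complex view, which is not shown here. GO is the batch separator recognised by SSMS and sqlcmd; it is needed because CREATE FUNCTION must be the only statement in its batch.

```sql
-- Hypothetical stand-in for the complex view.
CREATE TABLE dbo.DemoRows (Id int, Amount int);
INSERT INTO dbo.DemoRows VALUES (1, 10), (1, 20), (2, 30);
GO

-- Inline UDF: a single RETURN (SELECT ...), expanded into the calling query,
-- so the plan shows access to dbo.DemoRows.
CREATE FUNCTION dbo.DemoInline (@Id int)
RETURNS TABLE
AS
RETURN (SELECT r.Id, r.Amount
        FROM   dbo.DemoRows AS r
        WHERE  r.Id = @Id);
GO

-- Multi-statement UDF: fills a table variable in its own body,
-- so the calling query's plan only shows the function call.
CREATE FUNCTION dbo.DemoMulti (@Id int)
RETURNS @Result TABLE (Id int, Amount int)
AS
BEGIN
    INSERT INTO @Result (Id, Amount)
    SELECT r.Id, r.Amount
    FROM   dbo.DemoRows AS r
    WHERE  r.Id = @Id;
    RETURN;
END;
GO

-- Both calls return the same rows; compare their execution plans.
SELECT k.Id, f.Amount
FROM   (SELECT 1 AS Id UNION ALL SELECT 2) AS k
       CROSS APPLY dbo.DemoInline(k.Id) AS f;

SELECT k.Id, f.Amount
FROM   (SELECT 1 AS Id UNION ALL SELECT 2) AS k
       CROSS APPLY dbo.DemoMulti(k.Id) AS f;
```

Which style isolates a complex query better depends on the data volumes involved, so treat this as a starting point for comparing plans rather than a recommendation.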
