Why CTEs Are Faster Than Temp Tables in SQL Server Queries

sql server

I have 2 different queries for identifying dups (below). The only difference between the 2 queries is that one uses a CTE and the other uses a #Temp table.

Does anyone know why the CTE is so much faster than the #Temp table (0:20 seconds VS. 1:22)?

I would rather use the CTE but I need to run 2 STATEMENTS using the CTE (DELETE from CTE and then INSERT INTO table FROM CTE) but SQL Server only allows you to write one STATEMENT on the CTE.

Query 1:

WITH DUPS AS(
   Id,
   Column1,
   Column2,
   Column3,
   RN = ROW_NUMBER()OVER(PARTITION BY ID ORDER BY id)
   FROM mytable
)
  Select top 1 * FROM DUPS WHERE RN > 1 

Query 2:

 SELECT    
           Id,
           Column1,
           Column2,
           Column3,
           RN = ROW_NUMBER()OVER(PARTITION BY ID ORDER BY id)
       INTO #DUPS
       FROM mytable

       Select top 1 * FROM #DUPS WHERE RN > 1 

Best Answer

I believe if you look at the execution plans you'll see that the CTE gives you a better plan because it can take better advantage of the TOP. The Temp table will have to create the table in TempDB, persist all of the data to disk, and then select the top record. If you remove the TOP from both queries I bet you'll get much closer performance. That being said I would generally expect the CTE to out perform the temp table in a single use scenario like you have here since the CTE is doing less work than creating/inserting into a temp table.

This is probably just sample code but it is worth mentioning your results are non-determinate because you're using TOP without an ORDER BY clause.

For further reading you should check out this question. This answer to that question seems to apply to your situation in particular.


For the implied second half of your question:

Since you say you need to use the data from the CTE at least twice then you have four options:

  1. Copy and paste your CTE to both places
  2. Use the CTE to insert data into a Table Variable, and use the data in the table variable to perform the next two operations.
  3. Use the CTE to insert data into a Temp Table, and use the data in the temp table to perform the next two operations.
  4. Write a better tailored CTE. I assume you're doing different things so the queries must be slightly different.

I tend to prefer the option 2 (table variable) or option 4 (tailored CTE's) approach. I don't like the duplication and extra maintenance of copy/pasted CTE's. I also like the explicitly reduced scope of the table variable over a temp table. For an authoritative treatment on the differences between table variables and temp tables check out this answer. As far as choosing between options 2 and 4; that will depend on whatever your profiling dictates and/or what constraints you have to work within.