Sql-server – INSERT INTO #temp taking 1000 times longer than SELECT INTO #temp

sql serversql-server-2008-r2

I am shredding an xml blob that is being passed to me and loading a temp table for further processing. One of the things I noticed while working through the process is that performing an INSERT INTO is taking like 20 minutes while doing a SELECT INTO is taking seconds. The DB statistics are up to date and we are running daily maintenance plans so not sure what is going on. I know that SELECT INTO bypasses the logs which would make it faster, but I as not assuming that the INSERT INTO would be so much worse.

EDIT: Based on comments adding some more details

Why would the INSERT INTO be taking so much longer than the SELECT INTO for a temp table.

Query example below. I am shredding XML and inserting into a temp Table I have both the INSERT INTO and SELECT INTO commented out. The INSERT INTO for just 100 records is showing 382,000+ logical reads while the SELECT INTO is showing none.

;with cteScen AS (
SELECT    DENSE_RANK() OVER ( ORDER BY scenario_node ) AS scenario ,
        RANK() OVER ( PARTITION BY scenario_node ORDER BY scen_val ) AS level_order ,
        scen_val.value('text-value[1]',
                       'varchar(100)') AS 'scen_value' ,
        scen_val.value('(distribution/@id)[1]',
                       'int') AS 'dist'
FROM      @scenarioXML.nodes('/scenarios/scenario')
        AS scenario ( scenario_node )
        CROSS APPLY scenario.scenario_node.nodes('./values/value')
        AS x ( scen_val ))
--INSERT INTO #tmpTbl(Scenario, Level_Order, Scenario_Value, Distribution_ID)
Select top 100 *
--INTO #tmpTbl
From cteScen

Any thoughts?

Best Answer

There are a couple of connect items that might be relevant here.

An INSERT statement using XML.nodes() is very very very slow in SQL2008 SP1
Poor xml performance with table variable
Bad performance when inserting data from element-centric documents
INSERT from .nodes with element-based XML has poor performance on SP2 with x64

I have not been able to reproduce what you see with your query so I can not say for sure what is happening for you but one issue was that there was an eager spool step in the query plan that was not needed. That eager spool can be removed by using a trace flag in the first fixed versions of SQL Server. I am not sure if that is needed in later versions. The eager spool was in there "due to a general halloween protection logic". That might explain why you see the bad performance when you insert to an already existing table. If you have a look at the query plans for both your queries you should see the difference if that is the case.

There are also some workarounds suggested that you can try and I think you should try to specify the text node when you fetch text-value.

scen_val.value('(text-value/text())[1]', 'varchar(100)') AS 'scen_value' ,

If you don't do that SQL Server will aggregate all text values below text-value. For your XML you probably don't see a difference in the result but if text-value was a mixed content node with both text and sub nodes you would.

declare @XML xml = '
<root>
  <item>item text <subitem>sub item text</subitem>
  </item>
</root>'

select @XML.value('(/root/item)[1]', 'varchar(50)') as C1,
       @XML.value('(/root/item/text())[1]', 'varchar(50)') as C2

Result:

C1                           C2
---------------------------- ----------------------------
item text sub item text      item text