I am shredding an xml blob that is being passed to me and loading a temp table for further processing. One of the things I noticed while working through the process is that performing an INSERT INTO
is taking like 20 minutes while doing a SELECT INTO
is taking seconds. The DB statistics are up to date and we are running daily maintenance plans so not sure what is going on. I know that SELECT INTO
bypasses the logs which would make it faster, but I as not assuming that the INSERT INTO
would be so much worse.
EDIT: Based on comments adding some more details
Why would the INSERT INTO be taking so much longer than the SELECT INTO
for a temp table.
Query example below. I am shredding XML and inserting into a temp Table I have both the INSERT INTO
and SELECT INTO
commented out. The INSERT INTO
for just 100 records is showing 382,000+ logical reads while the SELECT INTO
is showing none.
;with cteScen AS (
SELECT DENSE_RANK() OVER ( ORDER BY scenario_node ) AS scenario ,
RANK() OVER ( PARTITION BY scenario_node ORDER BY scen_val ) AS level_order ,
scen_val.value('text-value[1]',
'varchar(100)') AS 'scen_value' ,
scen_val.value('(distribution/@id)[1]',
'int') AS 'dist'
FROM @scenarioXML.nodes('/scenarios/scenario')
AS scenario ( scenario_node )
CROSS APPLY scenario.scenario_node.nodes('./values/value')
AS x ( scen_val ))
--INSERT INTO #tmpTbl(Scenario, Level_Order, Scenario_Value, Distribution_ID)
Select top 100 *
--INTO #tmpTbl
From cteScen
Any thoughts?
Best Answer
There are a couple of connect items that might be relevant here.
An INSERT statement using XML.nodes() is very very very slow in SQL2008 SP1
Poor xml performance with table variable
Bad performance when inserting data from element-centric documents
INSERT from .nodes with element-based XML has poor performance on SP2 with x64
I have not been able to reproduce what you see with your query so I can not say for sure what is happening for you but one issue was that there was an eager spool step in the query plan that was not needed. That eager spool can be removed by using a trace flag in the first fixed versions of SQL Server. I am not sure if that is needed in later versions. The eager spool was in there "due to a general halloween protection logic". That might explain why you see the bad performance when you insert to an already existing table. If you have a look at the query plans for both your queries you should see the difference if that is the case.
There are also some workarounds suggested that you can try and I think you should try to specify the text node when you fetch
text-value
.If you don't do that SQL Server will aggregate all text values below
text-value
. For your XML you probably don't see a difference in the result but iftext-value
was a mixed content node with both text and sub nodes you would.Result: