Inserting 500K records into SQL Server 2008 is taking 15s; how to optimize?

Tags: execution-plan, insert, optimization, sql-server

I have a SQL Server 2008 machine backed by a SAN (I am not entirely certain of the SAN configuration). I've noticed that some queries are slow to respond, so I have been running tests to try to optimize/index the slow parts. I am not a SQL Server DBA (my background is more MySQL-centric), but I have the task of improving this performance.

My test is the following. I created two tables:

CREATE TABLE "Z_SIMULATION_0" 
(
    "ID" INT NOT NULL,
    "VALUE" INT NOT NULL,
    PRIMARY KEY ("ID")
);

CREATE TABLE "Z_SIMULATION_1_TABLE" 
(
    "ID" BIGINT NOT NULL,
    "ELEMENT" BIGINT NOT NULL,
    "flag" INT NULL DEFAULT NULL,
    "IS_ACTIVE" BIT NOT NULL DEFAULT b'0',
    PRIMARY KEY ("ID", "ELEMENT")
);

The indexes for the Z_SIMULATION_1_TABLE as exported by SSMS are:

CREATE NONCLUSTERED INDEX [IX_Z_SIMULATION_1_IS_ACTIVE] 
    ON [dbo].[Z_SIMULATION_1_TABLE] ([IS_ACTIVE] ASC)
       WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
             SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, 
             ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]

CREATE NONCLUSTERED INDEX [IX_Z_SIMULATION_INDEX_ACTIVE] 
    ON [dbo].[Z_SIMULATION_1_TABLE] ([ID] ASC, [IS_ACTIVE] ASC)
    INCLUDE ([ELEMENT], [flag]) 
       WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
             SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, 
             ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]

CREATE NONCLUSTERED INDEX [IX_ZSIMULATION1_ELEMENT] 
    ON [dbo].[Z_SIMULATION_1_TABLE] ([ELEMENT] ASC)
       WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
             SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, 
             ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]

CREATE NONCLUSTERED INDEX [IX_ZSIMULATION1_ID] 
    ON [dbo].[Z_SIMULATION_1_TABLE] ([ID] ASC)
       WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
             SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, 
             ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]

ALTER TABLE [dbo].[Z_SIMULATION_1_TABLE] 
  ADD CONSTRAINT [PK_Z_SIMULATION_1_TABLE] 
  PRIMARY KEY NONCLUSTERED ([ID] ASC, [ELEMENT] ASC)
      WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
            SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, 
            ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]

I then populated the Z_SIMULATION_0 table with 600K random (non-repeating) ID/VALUE pairs (IDs between 1 and 600K).
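
A minimal sketch of one way to generate such data (the exact population script isn't shown here; the stacked-CTE number generator and its names are illustrative only):

-- Sketch: build 600K sequential IDs, then pair each with a randomly
-- permuted (hence non-repeating) VALUE via NEWID().
;WITH
L0 AS (SELECT 1 AS c UNION ALL SELECT 1),               -- 2 rows
L1 AS (SELECT 1 AS c FROM L0 AS a CROSS JOIN L0 AS b),  -- 4 rows
L2 AS (SELECT 1 AS c FROM L1 AS a CROSS JOIN L1 AS b),  -- 16 rows
L3 AS (SELECT 1 AS c FROM L2 AS a CROSS JOIN L2 AS b),  -- 256 rows
L4 AS (SELECT 1 AS c FROM L3 AS a CROSS JOIN L3 AS b),  -- 65,536 rows
L5 AS (SELECT 1 AS c FROM L4 AS a CROSS JOIN L2 AS b),  -- 1,048,576 rows
Nums AS (SELECT TOP (600000)
                ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
         FROM L5
         ORDER BY n)
INSERT INTO Z_SIMULATION_0 (ID, [VALUE])
SELECT n,
       ROW_NUMBER() OVER (ORDER BY NEWID())  -- random permutation of 1..600000
FROM Nums;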

Finally, I executed this query:

INSERT INTO Z_SIMULATION_1_TABLE (ID, ELEMENT, IS_ACTIVE) 
    SELECT 2, value, 0 
    FROM Z_SIMULATION_0 
    WHERE ID>=1 AND ID <= 500000;

This query took 14s to complete, which I find significantly too long. I don't see why an insert into a blank table should take this long, and more importantly, I'm not sure how to optimize things to make it faster.
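
As an aside, SQL Server's per-statement statistics give a quick breakdown of where a timing like this goes (CPU vs. elapsed time, reads per table); for example:

SET STATISTICS TIME ON;  -- reports CPU and elapsed time per statement
SET STATISTICS IO ON;    -- reports logical/physical reads per table

INSERT INTO Z_SIMULATION_1_TABLE (ID, ELEMENT, IS_ACTIVE)
    SELECT 2, value, 0
    FROM Z_SIMULATION_0
    WHERE ID >= 1 AND ID <= 500000;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;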

This is a screenshot of the Execution Plan from SQL Server Management Studio.

I see the Sort (for the index, I presume?) is taking the bulk of the time, but I'm not sure what to do about that.

How can I improve the performance/reduce the amount of time it takes to do the insert? I'm open to changing/modifying/deleting indexes as required or necessary.

Best Answer

It isn't just doing one insert operation per row: there is one insert per row per index, with the associated sorts too. Each of those sorts may be spooling to disk (and probably is, with that much data), so there is a lot of IO going on.

When completely rebuilding a table's contents (i.e. starting with a blank table), it is usually more efficient to drop or disable all the indexes except the clustered index (if the table has one) before inserting the rows, then recreate the indexes afterwards.
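
For this table, that approach might look something like the sketch below (index names taken from the DDL above; the primary key is left enabled so uniqueness is still enforced during the load):

-- Disable the nonclustered indexes; disabled indexes keep their
-- definitions but are not maintained during the insert.
ALTER INDEX [IX_Z_SIMULATION_1_IS_ACTIVE]  ON [dbo].[Z_SIMULATION_1_TABLE] DISABLE;
ALTER INDEX [IX_Z_SIMULATION_INDEX_ACTIVE] ON [dbo].[Z_SIMULATION_1_TABLE] DISABLE;
ALTER INDEX [IX_ZSIMULATION1_ELEMENT]      ON [dbo].[Z_SIMULATION_1_TABLE] DISABLE;
ALTER INDEX [IX_ZSIMULATION1_ID]           ON [dbo].[Z_SIMULATION_1_TABLE] DISABLE;

INSERT INTO Z_SIMULATION_1_TABLE (ID, ELEMENT, IS_ACTIVE)
    SELECT 2, value, 0
    FROM Z_SIMULATION_0
    WHERE ID >= 1 AND ID <= 500000;

-- REBUILD re-enables each disabled index in one sorted pass apiece,
-- which is far cheaper than maintaining them row by row.
ALTER INDEX ALL ON [dbo].[Z_SIMULATION_1_TABLE] REBUILD;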

Update

If you can't remove the indexes because they may be used, then do some analysis as @ypercube implies to make sure that they are indeed required and useful, and remove any that are of doubtful benefit. You may of course be able to temporarily drop some of the indexes, even if not all. Experiment to see if you would benefit from a clustered index (rules of thumb say no in this example, but rules of thumb do not always apply, so running your own application-specific benchmarks is often a good idea).
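
One way to get a first read on whether an index is actually used is sys.dm_db_index_usage_stats; for example (its counters reset when the instance restarts, so treat low numbers with caution):

SELECT i.name,
       s.user_seeks, s.user_scans, s.user_lookups, s.user_updates
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
       ON  s.object_id   = i.object_id
       AND s.index_id    = i.index_id
       AND s.database_id = DB_ID()
WHERE i.object_id = OBJECT_ID('dbo.Z_SIMULATION_1_TABLE');
-- Indexes with high user_updates but few seeks/scans/lookups are
-- candidates for removal.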

Having said that, if you are only doing a partial update in real life the issue is moot: if you are only inserting enough data to increase the table's size by 1%, dropping the indexes would never be more efficient anyway. It would also be worth running your benchmarks on realistically sized data rather than an empty table, if the table in real life is not going to be empty.

If you have ruled out index changes, or done as much as you can in that direction, the only thing left is to improve the IO performance. Since we know nothing of your IO subsystem we can't help much at the moment, but a few general tips:

- Keep tempdb on different drives from your main database.
- For some IO patterns, keeping the logs on separate spindles can help too.
- Look at spreading the indexes between filegroups on different drives.
- If you have Enterprise Edition, and so can use partitioning, and your insert blocks map to a sensible partitioning scheme, you could use partition switching to do the inserts faster, then switch the whole block in at once.
- If your data stores are SAN-based, try moving tempdb to local drives (see the sketch after this list).
- If you have control over the hardware, consider putting some or all structures on SSDs.
- Adding more RAM may help, but probably not: you would have to add enough that all the index sort operations could happen in RAM at the same time, SQL Server might still spool to disk anyway (because it considers the RAM better used for something else), and even if the whole sort happens in RAM, the data still has to be written to physical media at the end; there is no avoiding that.
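
As an illustration of the tempdb point, relocating its files is one ALTER DATABASE per file plus an instance restart (the path below is hypothetical; check the logical file names in sys.master_files first):

-- Hypothetical local path; the new location takes effect on restart.
ALTER DATABASE tempdb
    MODIFY FILE (NAME = tempdev, FILENAME = 'L:\LocalFast\tempdb.mdf');
ALTER DATABASE tempdb
    MODIFY FILE (NAME = templog, FILENAME = 'L:\LocalFast\templog.ldf');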

There are a lot of articles (and whole books) out there on the subject of optimising your database IO system. If you list your current disk technology (or technologies) and layout, someone might be able to suggest practical things to try first. Whatever you try, remember to run realistic application benchmarks: artificial benchmarks are useful as a quick and easy gauge of what to expect, but sometimes the complexity of your real workload means that a massive benefit seen in artificial benchmarks in dev/test is all but destroyed in production.