Sql-server – Create index on 106 million row table

index, sql-server

I have an application that is installed worldwide and runs on SQL Server versions from SQL Server 2008 Express to SQL Server 2017 Enterprise Edition.

I have tried several ways to create an index (3 key columns plus 3 included columns) on a big table. The database is usually in the simple recovery model but can be in the full recovery model, and hardware can vary drastically from one customer to another:

  1. Alter table create index…
  2. Create a tmp table with the new index and copy the data with an "insert into", followed by a drop table of the original table and a rename of the tmp table to replace the original one.
  3. Create a tmp table with the new index and use a PowerShell script to copy the data, followed by a drop table of the original table and a rename of the tmp table to replace the original one.

The quickest of the three was the first option (the traditional way of creating a nonclustered index), which took 1h15m for a bit over 106 million records. The table has 200 columns; not ideal, but it's what I'm working with. The other options took three times as long or more.

Unfortunately, I can't use the ONLINE option, since customers run a variety of SQL Server versions and editions and online index operations are only available in Enterprise Edition.

An hour and 15 minutes in my lab is too long, and I would like to find a way to bring that down, especially since one of our next application updates will need to create 23 indexes. I have never seen an update that takes 5-6 hours, nor will customers want to wait 5-6 hours for a software update.

Also, the update will require no activity on the database during the update.

All I'm looking for are ideas I can try to get my indexes created in a reasonable amount of time. No code! I just need theoretical ideas.

Any idea would be greatly appreciated.

Best Answer

I'm leaving this as an answer, even though it's mostly an Extended Comment®.

You don't mention the index definitions, or the types of columns you're trying to index. You say the table is 200 columns, but unless you're creating a clustered index, that shouldn't matter. Unless you're trying to create a nonclustered index on 200 columns. If so, please reconsider.
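For reference, an index of the shape described in the question (three key columns, three included columns) would look roughly like the sketch below. The table, column, and index names are hypothetical placeholders, since the question doesn't give the real ones.

```sql
-- Hypothetical names: dbo.BigTable, KeyCol1..3 and IncCol1..3 are placeholders,
-- not taken from the question.
CREATE NONCLUSTERED INDEX IX_BigTable_KeyCol1_KeyCol2_KeyCol3
    ON dbo.BigTable (KeyCol1, KeyCol2, KeyCol3)   -- key columns, used for seeks and ordering
    INCLUDE (IncCol1, IncCol2, IncCol3);          -- stored at the leaf level only
```

The index itself holds only those six columns plus the row locator (the clustering key, or a RID for a heap), not all 200 columns of the table.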

Given that you're doing this across many different versions, it's likely that they're all on different hardware as well. In the case of creating indexes, hardware and existing indexes will have a dramatic impact on how long it takes to complete. It's not like you can tune the query behind index creation. Likewise, the ability for Enterprise Edition to parallelize index creation can be an important factor.
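Since edition and existing indexes both matter here, an update script could at least find out what it's dealing with before it starts. A minimal sketch, assuming a hypothetical table name dbo.BigTable:

```sql
-- What are we running on?
-- EngineEdition: 2 = Standard, 3 = Enterprise (also Developer/Evaluation), 4 = Express.
SELECT SERVERPROPERTY('Edition')        AS Edition,
       SERVERPROPERTY('EngineEdition')  AS EngineEdition,
       SERVERPROPERTY('ProductVersion') AS ProductVersion;

-- Which indexes already exist on the table? (dbo.BigTable is a placeholder name.)
SELECT i.name, i.type_desc, i.is_unique
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'dbo.BigTable');
```

Both `SERVERPROPERTY` and `sys.indexes` are available on every version mentioned in the question, so the same check can run everywhere.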

Aside from hardware, system concurrency can play a part as well, either via blocking or overall resource usage. You don't mention if these are issues, so again it's a blind spot for anyone trying to answer.
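If you want to rule blocking in or out while an index build is running, one rough way is to look at the requests DMV from a second session. This is only a sketch; it needs VIEW SERVER STATE permission, and the columns chosen are just a reasonable starting point.

```sql
-- Run from a separate session while CREATE INDEX is executing.
-- Any row returned is a request that is currently waiting on another session.
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time,      -- milliseconds spent in the current wait
       r.command
FROM sys.dm_exec_requests AS r
WHERE r.blocking_session_id <> 0;
```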

You can also choose to have the index build do its sort in tempdb (the SORT_IN_TEMPDB option, which exists in every version you list). This can help, as long as tempdb isn't Full Metal Potato© across your different environments.
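Sorting in tempdb is requested per index through the `SORT_IN_TEMPDB` option of `CREATE INDEX`. A minimal sketch, again with hypothetical table, column, and index names:

```sql
-- dbo.BigTable and the column names are placeholders.
-- SORT_IN_TEMPDB = ON stores the intermediate sort runs in tempdb
-- instead of in the user database's own filegroup.
CREATE NONCLUSTERED INDEX IX_BigTable_KeyCol1_KeyCol2_KeyCol3
    ON dbo.BigTable (KeyCol1, KeyCol2, KeyCol3)
    INCLUDE (IncCol1, IncCol2, IncCol3)
    WITH (SORT_IN_TEMPDB = ON);
```

This tends to pay off only when tempdb sits on different (ideally faster) storage than the user database, and tempdb needs enough free space to hold the sort.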

Finally, you don't mention the recovery model of your databases. Since CREATE INDEX can be minimally logged, it may be worth switching to the SIMPLE or BULK_LOGGED recovery model if possible. This will depend on your RPO and RTO goals, and whether they can be relaxed for a window while the indexes are created.
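For a database that is normally in FULL recovery, the switch could be sketched like this. The database name and backup path are hypothetical, and whether this is acceptable at all is the customer's call, not the script's.

```sql
-- [CustomerDb] and the backup path are placeholders.
ALTER DATABASE [CustomerDb] SET RECOVERY BULK_LOGGED;

-- ... run the CREATE INDEX statements here ...

ALTER DATABASE [CustomerDb] SET RECOVERY FULL;

-- Back up the log afterwards so point-in-time recovery is available again
-- from this backup onward.
BACKUP LOG [CustomerDb]
    TO DISK = N'X:\Backups\CustomerDb_after_index.trn';
```

Going from FULL to BULK_LOGGED and back keeps the log backup chain intact, but you can't restore to a point in time inside a log backup that contains minimally logged work. Switching to SIMPLE instead breaks the chain, so a full or differential backup would be needed after returning to FULL.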

A quick comment on option 2 that relates a bit to the comment above: if you create the table with no indexes, then do your data load and create the indexes afterwards, it will likely be faster. Especially in SIMPLE or BULK_LOGGED, you can use the TABLOCK hint to potentially get minimal logging on the insert as well as the index creation.
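The load-first, index-later variant of option 2 might be sketched as follows. All object names are hypothetical, and `SELECT *` is used only for brevity: with an identity column you would need an explicit column list and `SET IDENTITY_INSERT`, and `SELECT ... INTO` does not copy constraints, triggers, or computed-column definitions.

```sql
-- 1. Create an empty heap with the same columns and no indexes.
--    (dbo.BigTable / dbo.BigTable_New are placeholder names.)
SELECT *
INTO dbo.BigTable_New
FROM dbo.BigTable
WHERE 1 = 0;

-- 2. Load the heap. TABLOCK on the target is what makes minimal logging
--    possible under the SIMPLE or BULK_LOGGED recovery model.
INSERT INTO dbo.BigTable_New WITH (TABLOCK)
SELECT *
FROM dbo.BigTable;

-- 3. Build the indexes only after the data is in place.
CREATE NONCLUSTERED INDEX IX_BigTable_New_KeyCol1_KeyCol2_KeyCol3
    ON dbo.BigTable_New (KeyCol1, KeyCol2, KeyCol3)
    INCLUDE (IncCol1, IncCol2, IncCol3);

-- 4. Swap the tables once the copy has been verified.
DROP TABLE dbo.BigTable;
EXEC sp_rename N'dbo.BigTable_New', N'BigTable';
```

If the original table has a clustered index, it would need to be recreated on the new table as well, ideally before the nonclustered indexes so they aren't rebuilt a second time.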

Your question will probably be closed as too broad, but I hope you find this Extended Comment® helpful.