SQL Server: optimization techniques for rebuilding a very large table

sql-server, sql-server-2008

We generally use SQL Compare to generate upgrade/change scripts, and we also use the "Force column order" option by default. This basically means that the column order will be preserved going from database version A to B, so in situations where the column order would otherwise change, a table rebuild becomes necessary.

Well, somebody decided it would be a good idea to sandwich a new column in-between existing columns when making a schema change to a table containing a very large amount of data (obviously ill-advised). Here are some more details of what I am faced with:

  • The table has about 1.3 billion rows
  • A clustered index exists on the table
  • A few non-clustered indexes exist on the table
  • The server the database instance is running on is very limited from a resource perspective

The script generated by SQL Compare handles this by dropping all indexes on the table, throwing the data into a temp table, dropping the table, re-creating the table with the new schema, and inserting the data from the temp table into it. I feel like this can be optimized. Here is what I came up with to do so (a rough sketch of this plan follows the list):

  1. bcp data out to file
  2. disable indexes
  3. truncate table
  4. rebuild table based on new schema
  5. bcp data into new table schema
  6. enable indexes
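
For illustration only, here is a minimal sketch of that plan, assuming a hypothetical table MyDb.dbo.BigTable, a hypothetical nonclustered index IX_BigTable_SomeColumn, and local file paths. The bcp switches (server, authentication, batch size) would need to match your environment, and a format file is needed on the way back in because the column order changes:

    -- 1. bcp the data out to a native-format file (run from a command prompt):
    --    bcp MyDb.dbo.BigTable out D:\staging\BigTable.dat -n -S MyServer -T

    -- 2. disable the nonclustered indexes so they are not maintained during the load
    --    (do not disable the clustered index, or the table becomes inaccessible)
    ALTER INDEX IX_BigTable_SomeColumn ON dbo.BigTable DISABLE;

    -- 3. + 4. truncate, then drop and re-create the table with the new column order;
    --    note that dropping the table also drops its indexes, so they would have
    --    to be re-created rather than merely re-enabled
    TRUNCATE TABLE dbo.BigTable;
    -- DROP TABLE dbo.BigTable; CREATE TABLE dbo.BigTable (...new schema...);

    -- 5. bcp the data back in (command prompt again), using a format file to map
    --    the old column order onto the new one, with a batch size and TABLOCK hint:
    --    bcp MyDb.dbo.BigTable in D:\staging\BigTable.dat -f BigTable.fmt -S MyServer -T -b 100000 -h "TABLOCK"

    -- 6. a disabled index is re-enabled by rebuilding it (if the table was dropped
    --    and re-created instead, the indexes have to be re-created from scratch)
    ALTER INDEX ALL ON dbo.BigTable REBUILD;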

One obvious problem, if I am not mistaken, is that at some point the table has to be dropped and recreated, since the column order is changing. Dropping the table will also drop the indexes that I would like to keep in place, won't it? Has anybody faced a similar situation? Does my plan look like a better way to reduce potential downtime? Any ideas or advice will be greatly appreciated.

Best Answer

Code that relies on the order of columns is asking for trouble anyway. It should never be a problem, since as a DBA you're profiling for SELECT * statements and ostracizing the perpetrators, correct?

When I do have to perform such an operation, usually because of some dumb piece of legacy software, I follow these steps (a rough T-SQL sketch follows the list):

  1. rename existing table and named constraints
  2. create table with new schema
  3. bulk insert from the old table into the new one; if required, break on the id and do it in batches of 100,000* records using a while loop (* a number that suits the hardware & environment)
  4. drop old table - reclaim space here
  5. create indexes with proper fillfactors
  6. create constraints (after indexes)
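
If it helps, here is a minimal T-SQL sketch of those steps. The table dbo.BigTable, its columns, the constraint name PK_BigTable, the batch size, and the fill factors are all placeholders you would swap for your own:

    -- 1. rename the existing table and its named constraints out of the way
    EXEC sp_rename 'dbo.BigTable', 'BigTable_old';
    EXEC sp_rename 'dbo.PK_BigTable', 'PK_BigTable_old';

    -- 2. create the table with the new schema / column order
    CREATE TABLE dbo.BigTable
    (
        Id        INT          NOT NULL,
        ColA      VARCHAR(50)  NOT NULL,
        NewColumn INT          NULL,       -- the column being sandwiched in
        ColB      DATETIME     NOT NULL
    );

    -- 3. copy the rows across in key-range batches to keep the transaction log in check
    DECLARE @BatchSize INT = 100000,       -- tune to the hardware & environment
            @MinId     INT,
            @MaxId     INT;

    SELECT @MinId = MIN(Id), @MaxId = MAX(Id) FROM dbo.BigTable_old;

    WHILE @MinId <= @MaxId
    BEGIN
        INSERT INTO dbo.BigTable (Id, ColA, NewColumn, ColB)
        SELECT Id, ColA, NULL, ColB
        FROM dbo.BigTable_old
        WHERE Id >= @MinId
          AND Id <  @MinId + @BatchSize;

        SET @MinId = @MinId + @BatchSize;
    END

    -- 4. drop the old table to reclaim space
    DROP TABLE dbo.BigTable_old;

    -- 5. create the indexes with suitable fill factors
    CREATE UNIQUE CLUSTERED INDEX CIX_BigTable_Id
        ON dbo.BigTable (Id) WITH (FILLFACTOR = 90);
    CREATE NONCLUSTERED INDEX IX_BigTable_ColB
        ON dbo.BigTable (ColB) WITH (FILLFACTOR = 90);

    -- 6. re-create constraints last (defaults, checks, foreign keys)
    ALTER TABLE dbo.BigTable
        ADD CONSTRAINT CK_BigTable_ColA CHECK (ColA <> '');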