SQL Server – Adding a Clustered index to a HEAP table on someone else’s app

clustered-index, index-tuning, optimization, performance, sql-server

We're using a proprietary application based on SQL Server 2005, which has many HEAP based tables (that is, no Clustered index). Over the years, these tables have grown badly fragmented (e.g. 99% fragmentation). I need to defragment them.
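To see how bad it is, heap fragmentation can be measured with the `sys.dm_db_index_physical_stats` DMV (available in SQL Server 2005); `index_id = 0` is the heap itself. A minimal sketch, assuming a database called `VendorDb` (substitute your own name):

```sql
-- 'VendorDb' is an assumed name; adjust to your environment.
-- index_id = 0 identifies the heap itself (tables with no clustered index).
SELECT
    OBJECT_NAME(ps.object_id)       AS table_name,
    ps.avg_fragmentation_in_percent,
    ps.forwarded_record_count,      -- forwarded records are a heap-specific cost
    ps.page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID('VendorDb'),
         NULL, NULL, NULL,
         'DETAILED') AS ps          -- DETAILED is needed to populate forwarded_record_count
WHERE ps.index_id = 0               -- heaps only
ORDER BY ps.avg_fragmentation_in_percent DESC;
```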

Now, in SQL Server 2005, there's no way to do that directly. I can either:

  1. Create a Clustered index on a pre-existing field, and keep it
  2. Create that index, rebuild the table, and then drop it
  3. Add my own field (e.g. an auto-increment key) and cluster on that

Now, this is a large app, written by a vendor – a giant black box. I'm not eager to mess around with their stuff, and I don't know much about how the tables are used. What's the lowest-impact way to do this?

And, as a follow-up: there are many of these HEAP based tables, spread across more than a dozen databases used by the application. Is there a way to automate choice #2 or #3? Or, how should I pick which tables to modify?
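As a starting point for automating this, the heaps can at least be enumerated across every database. A sketch using the undocumented (but widely used) `sp_MSforeachdb` procedure:

```sql
-- Lists every user heap (index_id = 0) in every database on the instance.
-- sp_MSforeachdb is undocumented but ships with SQL Server; '?' is replaced
-- with each database name in turn.
EXEC sp_MSforeachdb N'
USE [?];
SELECT ''?''  AS database_name,
       s.name AS schema_name,
       t.name AS table_name
FROM sys.tables  AS t
JOIN sys.schemas AS s ON s.schema_id = t.schema_id
JOIN sys.indexes AS i ON i.object_id = t.object_id
WHERE i.index_id = 0          -- heap
  AND t.is_ms_shipped = 0;';  -- skip system tables
```

From that list you could generate the CREATE/DROP statements of option #2 per table, though which candidate column to cluster on is still a per-table judgment call.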


UPDATE:
To answer the questions posed by the (very helpful) responses:

  • Performance has been unacceptable. The vendor told my client that this is because the tables are extremely fragmented, and that it's the client's responsibility to defragment them regularly
  • We have two instances of the application: a test one and a production one
  • I first defragmented all of the tables that do have Clustered indexes, which greatly improved performance
  • The vendor's support team has been very unhelpful. They've told us it's our responsibility to defragment the tables. When I've asked them how they recommend defragmenting HEAP based tables – should I add a Clustered Index? – they've only responded "That's a Microsoft question".

In short, the customer support team has made it clear: you must defrag the tables; how you do it is your business.

As for future versions: Yes, new versions are being developed, and they will eventually migrate to SQL Server 2012. But they need performance solutions today.

Finally, as for the defragmenting taking too long: It doesn't matter. They have giant tables with 99% fragmentation; the application isn't used at night; I can easily spend hours at a time defragging them.

Best Answer

Fragmented heaps matter. The larger the heap, the harder it is to walk the data. Performance suffers.

Creating and then dropping the CI will reorder the rows in the table, as you're aware. If the vendor offers no solution and it's your responsibility to address the fragmentation, that's exactly what I've done in the past. If it were me, I wouldn't leave the CI in place; I'd drop it. I would avoid adding a column, since that permanently alters the structure of the table – and the same argument could be made for leaving the CI in place.
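The create-then-drop pattern is a short script per table. A sketch, where `dbo.VendorTable` and `SomeExistingColumn` are placeholder names – pick a narrow, preferably ever-increasing column that already exists in the vendor's schema:

```sql
-- Building the clustered index rewrites the heap's rows contiguously
-- in index order; this is the defragmentation step.
CREATE CLUSTERED INDEX CIX_Defrag_VendorTable
    ON dbo.VendorTable (SomeExistingColumn);

-- Dropping the index returns the table to a heap, leaving the freshly
-- rewritten pages behind. Fragmentation will start accumulating again
-- from here, so this needs to be repeated periodically.
DROP INDEX CIX_Defrag_VendorTable
    ON dbo.VendorTable;
```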

If you do this, be sure you have enough free storage for building the CI – it will need space at least equal to the size of the heap – and schedule the work with your service-level requirements in mind.