SQL Server – Splitting a large table

Tags: partitioning, sql-server, sql-server-2000, sql-server-2008-r2

My scenario is that I look after an old system with a VB6-based front end and a SQL Server back end (the good news is a replacement is being developed, but in the meantime I need to keep this running). The application is used at a dozen or so sites, most of which are on SQL Server 2000; one is on SQL Server 2008 and one on SQL Server 2008 R2 (all Standard Edition). Changing versions or editions would be a big ask.

The site on SQL Server 2008 R2, as it happens, is also the largest. We now have a table with over two billion rows and 700 GB of data. It has a small number of columns, one being a varchar(256) used to store serialised blob data. The table is now so large that when people insert, update and delete rows (via the app), users occasionally block each other. This can only get worse as time goes on. To complicate matters, the next-largest site is on SQL Server 2000.

I need to get the size of this table down. What methods or techniques could I use to achieve this? Ideally the technique would work on SQL Server 2000 Standard Edition so I could use it at all sites, but just fixing the biggest offender on SQL Server 2008 R2 Standard would be something.

The table does have an identity column, and as the table definition is controlled by the app it cannot be changed (I mention this because I believe it rules out a partitioned view: the rows are INSERTed, UPDATEd and DELETEd by the app).

My current thought is to delete all data older than X months (about 50% of the total) and hive it off to a table not exposed to the user or the app. I could then add a way for users to request that items be copied back across via a scheduled job (possibly overnight). This feels clunky to me, however.

Best Answer

We now have a table with over two billion rows and 700 GB of data. It has a small number of columns, one being a varchar(256) used to store serialised blob data. The table is now so large that when people insert, update and delete rows (via the app), users occasionally block each other. This can only get worse as time goes on.

Why do users block each other? Are they all accessing the same rows? Perhaps the real solution is query and index tuning, so that only the needed rows are touched regardless of table size.
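As a sketch of that tuning angle (the schema isn't shown, so dbo.AppData, CreatedDate and the index name below are assumptions): if the app's UPDATE and DELETE statements filter on an unindexed column, each one scans the whole table under locks, and an index on the filter column turns those scans into seeks, which by itself can remove most of the blocking. On the 2008 R2 site the missing-index DMVs can point at candidates first:

```sql
-- Missing-index suggestions (these DMVs exist on SQL Server 2005+ only;
-- treat them as hints to investigate, not commands to run blindly).
SELECT d.statement, d.equality_columns, d.inequality_columns,
       d.included_columns, s.user_seeks, s.avg_user_impact
FROM sys.dm_db_missing_index_details AS d
JOIN sys.dm_db_missing_index_groups AS g
    ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats AS s
    ON s.group_handle = g.index_group_handle
ORDER BY s.user_seeks * s.avg_user_impact DESC;

-- Hypothetical example: if statements filter on CreatedDate, a plain
-- nonclustered index lets them seek instead of scanning two billion
-- rows. CREATE INDEX syntax like this works back on SQL Server 2000.
CREATE NONCLUSTERED INDEX IX_AppData_CreatedDate
    ON dbo.AppData (CreatedDate);
```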

Archiving read-only historical data will improve the performance of full scans, improve the manageability of the current data, and perhaps reduce I/O due to the smaller working set of active data. You could simply move old data to an archive table in the same database during a scheduled daily process. Rather than have users invoke a separate process to copy the data back, consider a UNION ALL view over the current and archive tables for when users need to report on both current and read-only historical data.
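A minimal sketch of that approach, using invented names (dbo.AppData, dbo.AppDataArchive, RecordID, CreatedDate, BlobData) since the real schema isn't shown. On the 2008 R2 site, DELETE ... OUTPUT INTO moves rows in a single statement, and doing it in small batches keeps each transaction short so the overnight job doesn't block the app:

```sql
-- Invented names throughout; adjust to the real schema.
DECLARE @cutoff datetime, @rows int;

-- Placeholder cutoff: the question leaves "X months" unspecified.
SET @cutoff = DATEADD(MONTH, -12, GETDATE());
SET @rows = 1;

WHILE @rows > 0
BEGIN
    -- Move up to 5,000 rows per iteration; each DELETE commits on its
    -- own, so locks are held only briefly (requires SQL Server 2005+).
    DELETE TOP (5000)
    FROM dbo.AppData
    OUTPUT deleted.RecordID, deleted.CreatedDate, deleted.BlobData
        INTO dbo.AppDataArchive (RecordID, CreatedDate, BlobData)
    WHERE CreatedDate < @cutoff;

    SET @rows = @@ROWCOUNT;
END;
GO

-- One object for reporting across live and archived data; the app
-- continues to INSERT/UPDATE/DELETE against dbo.AppData only.
CREATE VIEW dbo.AppDataAll
AS
SELECT RecordID, CreatedDate, BlobData FROM dbo.AppData
UNION ALL
SELECT RecordID, CreatedDate, BlobData FROM dbo.AppDataArchive;
GO
```

On the SQL Server 2000 sites the OUTPUT clause and TOP in DML don't exist, so the equivalent there is SET ROWCOUNT 5000 with paired INSERT ... SELECT / DELETE statements in a loop; the UNION ALL view itself works the same on 2000 Standard.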