SQL Server – Large table and poor performance – what to do next

hardware, partitioning, sql-server, sql-server-2008-r2

We have a Windows 2008 R2 / SQL Server 2008 R2 (Standard) server that is used to host a single database. The database itself mainly consists of a single large table containing both live and historical data. The table currently holds 101 million rows across 35 columns and is growing at around 250,000 rows a day. Splitting it into smaller tables unfortunately isn't really an option, due to a mass of legacy code.

The database itself (around 100 GB) is held as a single file on a single SSD. The server has another two 10K SAS disks used for the OS, paging, etc., and it has 22 GB of RAM.

Although everything's running fine, we have a few dozen users who need to query the data in this table. We have limited control over what these queries do: sometimes they pull a few hundred rows from yesterday; at other times it's tens of thousands of rows from six months ago. 99.9% of the activity is reading rows; there is very little writing apart from the live data being INSERTed throughout the day. At peak times, simple queries that return a lot of data can take half an hour or more to complete.

We have indexes in place that are helping, but the ultimate bottleneck appears to be disk I/O. The SSD in place isn't the fastest, and as a result we're looking at retrofitting a RAID 1+0 array of high-end SSDs to increase performance (we've checked that the array card can handle the throughput).
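(For what it's worth, a rough way to confirm that reads really are waiting on disk is a wait-stats query like the sketch below; focusing on PAGEIOLATCH waits is just an assumption about what the bottleneck looks like here.)

```sql
-- Illustrative only: check whether buffer-pool reads from disk (PAGEIOLATCH waits)
-- dominate the instance's accumulated wait time since the last restart.
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms
FROM   sys.dm_os_wait_stats
WHERE  wait_type LIKE 'PAGEIOLATCH%'
ORDER  BY wait_time_ms DESC;
```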

Assuming we have this array in place, what is the best plan to increase read throughput to this database? If we have a super-fast SSD array, is that enough? Or would partitioning the database into separate files on separate logical drives be a better idea, even though they're essentially destined for the same disks? Similarly, would splitting database and log files across logical drives in the same array make any difference?

Best Answer

If you were using Enterprise Edition, I would recommend partitioning. Separate files on separate logical drives is unlikely to provide much help, since the data is all going through the same controller and writing to the same underlying disk (and without partitioning, you don't have much control over proportional fill, meaning many queries will likely still have to hit several or all files anyway).
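(Just for illustration, since Standard Edition takes it off the table here: the partitioning approach would look roughly like the sketch below. Every object and column name in it is made up.)

```sql
-- Hedged sketch only: Enterprise Edition table partitioning by date.
-- pf_ByQuarter, ps_ByQuarter, dbo.BigTable and RecordDate are illustrative names.
CREATE PARTITION FUNCTION pf_ByQuarter (datetime)
AS RANGE RIGHT FOR VALUES ('20110101', '20110401', '20110701', '20111001');

CREATE PARTITION SCHEME ps_ByQuarter
AS PARTITION pf_ByQuarter ALL TO ([PRIMARY]);  -- or map each range to its own filegroup

-- The table (via its clustered index) is then created on the scheme, e.g.:
-- CREATE CLUSTERED INDEX cix_BigTable ON dbo.BigTable (RecordDate) ON ps_ByQuarter (RecordDate);
```

RANGE RIGHT with date boundaries keeps each period's rows together, and new boundary values can be split in as the data grows.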

Since you're not using Enterprise Edition, one alternative would be to split the data into multiple tables and then have views that combine them. Assuming the older data no longer gets updated, you could even put it on read-only filegroups, which will alleviate some resource contention. You can make the query logic as complicated and picky as you want: the simplest approach is to just issue all queries against the view, but your data access layer or stored procedure logic could, based on the date range parameters, decide which table(s) to access at runtime. You can even use filtered indexes so that certain queries pull a very limited subset of data, even from within the table that holds it. This isn't necessarily an easy button, but I can elaborate on some of these details if you're interested in putting in some elbow grease to implement something.
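To make that concrete, here is a minimal sketch of the multiple-tables-plus-view idea (a local partitioned view). The table and column names are invented, and your real table obviously has 35 columns rather than three:

```sql
-- Hedged sketch: split by year into separate tables and union them in a view.
-- All object names are illustrative.
CREATE TABLE dbo.Activity_2010
(
    ActivityID   int          NOT NULL,
    ActivityDate datetime     NOT NULL
        CONSTRAINT CK_Activity_2010
        CHECK (ActivityDate >= '20100101' AND ActivityDate < '20110101'),
    Payload      varchar(100) NULL,
    CONSTRAINT PK_Activity_2010 PRIMARY KEY (ActivityDate, ActivityID)
);

CREATE TABLE dbo.Activity_2011
(
    ActivityID   int          NOT NULL,
    ActivityDate datetime     NOT NULL
        CONSTRAINT CK_Activity_2011
        CHECK (ActivityDate >= '20110101' AND ActivityDate < '20120101'),
    Payload      varchar(100) NULL,
    CONSTRAINT PK_Activity_2011 PRIMARY KEY (ActivityDate, ActivityID)
);
GO

-- Older tables can live on a filegroup marked read-only, e.g.:
-- ALTER DATABASE MyDb MODIFY FILEGROUP ArchiveFG READ_ONLY;

-- Legacy code keeps querying a single name.
CREATE VIEW dbo.Activity
AS
    SELECT ActivityID, ActivityDate, Payload FROM dbo.Activity_2010
    UNION ALL
    SELECT ActivityID, ActivityDate, Payload FROM dbo.Activity_2011;
GO
```

Because the CHECK constraints don't overlap and the date column is part of each primary key, the optimizer can skip any member table whose range can't satisfy the query's date predicate.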

The cheapest fix, of course, would be to throw hardware at the problem: specifically, such that more of your database fits into memory. RAM is cheap, and assuming your box supports it, I would say that an upgrade from 22 GB (what an odd number?) to 64 GB (the most SQL Server Standard Edition can use) will go a long way. (And if the OS is Standard as well, Windows caps you at 32 GB; that would still help, just not as much.) While I'd love to see an actual execution plan and STATISTICS IO metrics for a query that takes half an hour to return from an SSD, the larger the portion of the data that's in memory, the faster it will be.
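If you do want to capture those numbers, and to adjust SQL Server's memory cap after a RAM upgrade, something along these lines would do it; the memory value shown is purely illustrative:

```sql
-- Gather the metrics mentioned above: run the slow query with
-- "Include Actual Execution Plan" enabled in SSMS and these options on.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
-- <slow query goes here>

-- After a RAM upgrade, cap SQL Server's memory so the OS keeps some headroom.
-- 59392 MB (58 GB) is an example value for a 64 GB box, not a recommendation.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 59392;
RECONFIGURE;
```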