SQL Server – Using schema + filegroups or an entirely separate database (same instance) to partition some big tables

database-design, sql-server, sql-server-2014

I have the following challenge:

I have a staging database that holds a few big tables on SSDs, but I am starting to run out of space on those SSDs. I use SQL Server 2014 BI Edition (= Standard Edition for the relational engine), so table partitioning is not available out of the box. My approach would be to split the data into separate tables and join them back together in a view.

The data itself is (mostly) written once and then only read from.

Which of the two solutions suggested below would be better, and why?

Solution A: Use filegroups to move data to different disks:

  • create two new filegroups on separate disks (non-SSD, RAID 5)
  • create several tables (e.g. one per quarter per year) and move the old data into those tables based on some rules (e.g. move data older than 6 months)
  • make one filegroup read-only and the other read-write; more current data resides in the read-write filegroup and the older data in the read-only filegroup
  • create the same indexes (same columns) on all tables
  • use a view to have transparent access to all data (see the sketch after this list)
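
A minimal sketch of what Solution A could look like, assuming a sales-style table; all database, file, filegroup, and table names here are hypothetical:

    -- Add a filegroup for historic data and place a file on the non-SSD array
    ALTER DATABASE StagingDB ADD FILEGROUP HistoryFG;
    ALTER DATABASE StagingDB
        ADD FILE (NAME = 'HistoryData1',
                  FILENAME = 'E:\SQLData\StagingDB_History1.ndf',
                  SIZE = 10GB, FILEGROWTH = 1GB)
        TO FILEGROUP HistoryFG;

    -- One table per quarter on the history filegroup; the CHECK constraint
    -- tells the optimizer which date range lives in which table
    CREATE TABLE dbo.Sales_2014_Q1
    (
        SaleID   bigint NOT NULL,
        SaleDate date   NOT NULL,
        Amount   money  NOT NULL,
        CONSTRAINT PK_Sales_2014_Q1 PRIMARY KEY (SaleID),
        CONSTRAINT CK_Sales_2014_Q1
            CHECK (SaleDate >= '20140101' AND SaleDate < '20140401')
    ) ON HistoryFG;

    -- Current data stays on the SSDs (here: the PRIMARY filegroup)
    CREATE TABLE dbo.Sales_Current
    (
        SaleID   bigint NOT NULL,
        SaleDate date   NOT NULL,
        Amount   money  NOT NULL,
        CONSTRAINT PK_Sales_Current PRIMARY KEY (SaleID),
        CONSTRAINT CK_Sales_Current CHECK (SaleDate >= '20140401')
    ) ON [PRIMARY];
    GO

    -- View that glues the pieces back together
    CREATE VIEW dbo.Sales
    AS
    SELECT SaleID, SaleDate, Amount FROM dbo.Sales_2014_Q1
    UNION ALL
    SELECT SaleID, SaleDate, Amount FROM dbo.Sales_Current;

With trusted CHECK constraints on the date column, this is a local partitioned view: the optimizer can skip member tables whose date range a query's WHERE clause cannot touch.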

Solution B: Use a separate database to hold the historic data

  • implemented in more or less the same way, but adding a database (in the same instance) as an additional layer (sketched below)
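
A sketch of the Solution B variant, again with hypothetical names; the historic tables live in a second database on the same instance, and the view in the staging database references them with three-part names:

    -- In StagingDB: the view spans both databases
    CREATE VIEW dbo.Sales
    AS
    SELECT SaleID, SaleDate, Amount FROM HistoryDB.dbo.Sales_2014_Q1
    UNION ALL
    SELECT SaleID, SaleDate, Amount FROM dbo.Sales_Current;

One design consequence worth noting: the two databases are backed up and restored independently, so their backups are not guaranteed to be transactionally consistent with each other.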

Thanks for sharing your ideas and advice.

Best Answer

Your problem is apparently all about storage space.

Your SSD array is not large enough to hold all your history on SSD.

Creating another filegroup is a good way to separate the use of storage space. That way you control what goes onto the historic-data disk array. (When creating objects on a filegroup other than the default, you must of course be sure to specify the filegroup you are using.)
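
For example (continuing with the hypothetical names from the question's sketch), objects land on the default filegroup unless the target is named explicitly, and once the old data is loaded the historic filegroup can be frozen:

    -- Name the filegroup explicitly when creating tables and indexes
    CREATE INDEX IX_Sales_2014_Q1_SaleDate
        ON dbo.Sales_2014_Q1 (SaleDate) ON HistoryFG;

    -- After loading, freeze the filegroup (needs exclusive database access)
    ALTER DATABASE StagingDB MODIFY FILEGROUP HistoryFG READ_ONLY;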

Depending on how massive the historic data is, you may not need to split the data by quarters, etc.

For example, you were running well on the SSDs until space became a problem. So, remembering that space is the problem, you could ask yourself questions like:

  1. Are my indexes well-designed for this workload? (See the DMV query sketch after this list.)
  2. Do my queries require scanning all of the historic data? (Apparently not, according to your comments above.)
  3. Is the overhead of maintaining quarterly databases going to buy you significant query improvements? (Especially if you will need data from more than one database.)
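
For question 1, a quick first look can come from the index-usage DMV (a standard system view; note that its counters reset when the instance restarts). A minimal sketch:

    -- Read vs. write activity per index in the current database
    SELECT OBJECT_NAME(s.object_id)                     AS table_name,
           i.name                                       AS index_name,
           s.user_seeks + s.user_scans + s.user_lookups AS reads,
           s.user_updates                               AS writes
    FROM sys.dm_db_index_usage_stats AS s
    JOIN sys.indexes AS i
        ON i.object_id = s.object_id
       AND i.index_id  = s.index_id
    WHERE s.database_id = DB_ID()
    ORDER BY reads DESC;

Indexes with high writes and few or no reads are costing you space and maintenance without paying for themselves.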

Making things simple and well-designed should be the first goal. If that is not sufficient, then add layers as they are needed. Keeping performance statistics will help you measure the benefits.

There are plenty of websites that can help. Here is one set of DMV queries by Glenn Berry: http://www.sqlskills.com/blogs/glenn/category/dmv-queries/