Sql-server – Partitioning a table on a column in another table

partitioningsql server

I have a fact table called FactAccount which has the following columns:

  1. AccountId (identity)
  2. AccountMonthId – FK referring to DimMonth
  3. AccountYear

The DBA has done the partitioning on the AccountYear (2014, 2015, 2016…) column. But this table will be queried on the AccountMonthId column (with a join to DimMonth) where users will select the year and month from the DimMonth dimension table.

So in this case, I don't think the partitioning will make any difference as the WHERE clause doesn't filter by AccountYear.

The DimMonth dimension contains all combinations of year and month, i.e. 12 dimension members for each year and the AccountMonthId column is an identity column, not a natural key.

Is there a way to partition a table on a column in a table referenced by a foreign key? Or, to be very specific in my case, is it possible to partition the FactAccount table on DimMonth.AccountMonthId?

Best Answer

Short answer: No, you cannot partition by a column in another table, because that would cause problems whenever you change something in the "other" table - rows would magically have to move between partitions.

Summarizing some of the views in the comments from Aaron, Kenneth and me on your question:

  • You could change AccountMonthId from a sequential number to a natural key, in this case an integer column from which you can compute the year and month. For example, the value 201503 could correspond to march 2015. This type of key is very common for date dimensions, because date dimensions don't change, so you can hard-code their keys.

  • With a natural key for AccountMonthId you could partition on this column, RANGE RIGHT FOR VALUES (201400, 201500, 201600) or RANGE LEFT FOR VALUES (201499, 201599, 201699).

  • Whether you have dates or just month values in your table, you could use a date column to store dates/months in your fact tables and in the month dimension. This would not only save storage space (three bytes, as opposed to four with int), it would also allow for your queries to use date functions among other gains. The same logic still applies with regards to partitioning.

  • You can still achieve what's known as "partition elimination" (querying a single partition) in your current setup, as long as you include the partition column in your query. With your current setup, that could look something like Select * from FactAccount A inner join DimMonth m on a.AccountMonthId = m.MonthId where M.Month = 2 and m.Year = 2016 and a.AccountYearId = 2016

  • You could partition on months instead of years. It'll take up the same disk space, although there are perhaps more partitions to manage, but this seems like a minor cost.

  • Adding an extra column for the year and partitioning on that column seems a bit lazy, ignorant, or at least strange of your DBA.

  • We have hardly touched on the subject of indexing - indexes, be they aligned or non-aligned, are often much more important than partitioning as performance goes.