I have a fact table called FactAccount
which has the following columns:
AccountId
(identity)AccountMonthId
– FK referring toDimMonth
AccountYear
The DBA has done the partitioning on the AccountYear
(2014, 2015, 2016…) column. But this table will be queried on the AccountMonthId
column (with a join to DimMonth
) where users will select the year and month from the DimMonth
dimension table.
So in this case, I don't think the partitioning will make any difference as the WHERE
clause doesn't filter by AccountYear
.
The DimMonth
dimension contains all combinations of year and month, i.e. 12 dimension members for each year and the AccountMonthId
column is an identity column, not a natural key.
Is there a way to partition a table on a column in a table referenced by a foreign key? Or, to be very specific in my case, is it possible to partition the FactAccount
table on DimMonth.AccountMonthId
?
Best Answer
Short answer: No, you cannot partition by a column in another table, because that would cause problems whenever you change something in the "other" table - rows would magically have to move between partitions.
Summarizing some of the views in the comments from Aaron, Kenneth and me on your question:
You could change
AccountMonthId
from a sequential number to a natural key, in this case an integer column from which you can compute the year and month. For example, the value 201503 could correspond to march 2015. This type of key is very common for date dimensions, because date dimensions don't change, so you can hard-code their keys.With a natural key for
AccountMonthId
you could partition on this column,RANGE RIGHT FOR VALUES (201400, 201500, 201600)
orRANGE LEFT FOR VALUES (201499, 201599, 201699)
.Whether you have dates or just month values in your table, you could use a date column to store dates/months in your fact tables and in the month dimension. This would not only save storage space (three bytes, as opposed to four with int), it would also allow for your queries to use date functions among other gains. The same logic still applies with regards to partitioning.
You can still achieve what's known as "partition elimination" (querying a single partition) in your current setup, as long as you include the partition column in your query. With your current setup, that could look something like
Select * from FactAccount A inner join DimMonth m on a.AccountMonthId = m.MonthId where M.Month = 2 and m.Year = 2016 and a.AccountYearId = 2016
You could partition on months instead of years. It'll take up the same disk space, although there are perhaps more partitions to manage, but this seems like a minor cost.
Adding an extra column for the year and partitioning on that column seems a bit lazy, ignorant, or at least strange of your DBA.
We have hardly touched on the subject of indexing - indexes, be they aligned or non-aligned, are often much more important than partitioning as performance goes.