I have one large table that is partitioned (table name: Trans
). At the moment this table is to be created on 32 partitions. This table will contain approximately 300 million records and data older than 14 days will be deleted daily. One of the columns in this table is a reference to a table that will contain up to 5 million records (table name: Sens
) and I also want it to be partitioned. I would like to ask you about:
-
Will it be a problem that both tables will use the same partitioning function? So the
Sens
table would also be distributed over 32 partitions and would be save on the same files as theTrans
table. Is this a good approach ? -
The
Trans
table has aPK
based on two columnsTranID (Identity (1,1))
andParititionID
. At the moment,FK
to a smaller table ('Sens') is based on only one column –SenID
. The smaller table also has to be partitioned. What will be the difference in the approach / efficiency / speed of operation if thePK
in theSens
table will only be on theIDENTITY (1,1)
column instead of theIDENTITY (1,1)
column and thepartition column
, i.e.
ALTER TABLE [dbo].[Sen]
ADD CONSTRAINT [PK_SenID]
PRIMARY KEY CLUSTERED ([SenID] ASC) ON [PRIMARY];
-- or
ALTER TABLE [dbo].[Sen]
ADD CONSTRAINT [PK_SenID]
PRIMARY KEY CLUSTERED (
[SenID] ASC,
[PartitionID]
) ON [psTrans]([PartitionID])
- Have you ever try to have partition column which is computed ? I am thinking about choose partition according to new column which is computed base on other column in table:
CAST(HASHBYTES('MD5', [othercolumnInTable]) AS tinyint) % 32
Best Answer
For the first question, there's no issue with using the same function for two tables as long as the definition of the partition function doesn't ever need to change. You mentioned deleting daily data and your partition function is planned to contain 32 partitions, so I assume that you're creating one partition per day of the month. If so, I can't see a reason why you would need to merge or split partitions. With that said, given that you just have two tables involved there isn't anything wrong with creating two separate functions as well.
Whether or not the two tables should exist on the same database files depends on how you're defining your files and filegroups. Given that you're only keeping 14 days of data and you purge the rest, I assume that you aren't putting some partitions on different classes of storage. 5 million rows in the
Sens
table really isn't a large number, so it's likely true that there will be no issue with using the same database files with no tables. This is just a guess based on incomplete information.For the second question, evaluate whether or not a partition-aligned index is the right fit for the table:
My own opinion is that you shouldn't partition a table with 5 million rows unless you have a very good reason. Partitioning is a feature designed for large tables. Why do you need to partition a table with 5 million rows?
For the final question, I have worked with partitioned tables that have computed partition columns. I do not recommend it unless you have no other choice. We consistently ran into odd issues including partition elimination not working as expected.