Sql-server – Multiple Partition Keys

partitioningscalabilitysql server

This is a follow-on from a previous question.

I've never worked with SQL Server partitioning but I currently faced with designing a database for which the volumes probably warrant it. The system is for coupons. The coupons are to be issued periodically, usually every six weeks although there will also be ad-hoc issuance – eg for a special event. There are 15 million customers and for each issuance event, every customer will receive 6 different coupon types, giving a total of 90 million coupon instances. We need to track the coupon instance redemption data and maintain this for 6 months, although typically a coupon is only valid for six weeks. Any redemption request for an invalid coupon will not reach the database because it will be validated by the POS till.

Over a six month period we'll need to store up 1 billion rows in the Coupon Instance table and up to 200 million (assuming max 20% redemption rate) in the Redemption table.

There will be two main queries to these tables:
1. select from coupon instance where coupon barcode = x (for redemption)
2. select from coupon instance where loyalty card number = x (to issue a coupon via till printer)

To help with the 1st query I could partition the instance table by issuance event and then embed the issuance event in the barcode. So, if we have an issuance event that creates 50m records in the instance table, we have a single partition for these records – SQL is then able to go to the correct partition for a given barcode substring (a 2 digit issuance event number)

What about the 2nd query? I guess it's not possible to partition two ways? If I were to implement this using database sharding then my sharding algorithm would dictate that I need to store the same record on multiple shards in 50% of records. This would work but it's expensive in storage space and I'd like to avoid sharding because of the "re-balancing" required when an addition shard is added

Is there any special way partitioning can help here or should I simply partition to help with the 1st query and resign myself to seek of a full table high index for the 2nd?

Thanks

Rob

Best Answer

You really don't need partitioning for this to work efficiently, which is what I told you on your last question as well...even if you have billions of rows.

If you cluster on the BarcodeID (which I am assuming is unique) and put a nonclustered index on LoyaltyCardID it should work just fine. These are NOT complicatd queries with a lot of additional logic from the sounds of things, and simple seek operations are extremely efficient on their own.

Are you getting pressure to partition or is it just something that you have decided to do?