Sql-server – Will table partitioning speed up the queries

partitioningsql serversql-server-2012

My database is set up with what we call our staging tables and our live tables. Currently for we have about 2 million rows for a table called products. Our customers we call manufacturers load their data into the staging tables and when they are done updating, adding, deleting data we then push that data into our live tables.

These live tables feed a web service for a mobile app and websites. The pushing of data from staging to live consists of deleting all the data in the live tables and inserting all of the data from the staging tables. All of this is segregated by a column called ManufacturerID that exists in every table.

Some manufacturers have 500 products, some have 75,000. Depending on this we sometimes have REALLY REALLY slow web service responses because of paging through all of the 2 million records. Deleting the data from the live tables also seems to be getting ridiculously slow.

Would partitioning my products table by ManufacturerID help this situation? From what I have read this would basically mean when I am querying my products I would only be querying a small subset of the database on that ManufacturerID and therefore see a huge improvement in overall response time.

Best Answer

No. Partitioning does not speed things up. Your table that has 2 million records before partitioning, it will continue to have exactly the same 2 million records after partitioning. If you want to look only at a small subset of the records use an index. It does seem like your data is a multi-tenant schema with the tenant key be the ManufacturerID. In such cases the most likely design is to have the ManufacturerID be a leading key in the clustered indexes. BTW, I recommend reading Multi-Tenant Data Architecture.

Partitioning would help in scenarios involving fast data switch-in and switch-out, or in scenarios requiring distribution of data do different physical paths (filegroups). A good read is How To Decide if You Should Use Table Partitioning.

Before you say 'but what about partition elimination?' I'll say this: there's almost nothing partition elimination can do that an index cannot do better. Not to mention that partition elimination is a lottery while range scan on leading key in clustered index is pretty much a sitting duck. Besides, how many tenants will you have per partition? And how will you distribute them evenly, consider how unbalanced they seem to be? Both these problems are, again, addressed much better by a leading key in the clustered index. And, not least, consider the huge design changes partitioning brings (e.g. kiss goodbye the unique/primary key constraints or introduce unaligned indexes), see Special Guidelines for Partitioned Indexes, pay special attention to the 'Memory Limitations' section.

Partitioning is an awesome feature, but is not a performance feature. An example of what partitioning could help is to do a very fast switch-in of the staged data (which is a problem you raised), but that would require one partition per manufacturer, and it will not scale, assuming that each 'manufacturer' is a client of your business (a tenant).