SQL Server Performance – Should a Schema Be Created Per Project?

performance, schema, sql server

This is a database where users create a 'project' and work on it. We have a project table and various other tables that hold different attributes of a project (each project has multiple rows per table, anywhere from 2 to 30K). All of them contain projectID as the link to the project. The database is currently 400GB.

We are considering creating a schema for each project, with each schema containing all of the attribute tables. Whenever a project is created, it would get a new schema. This would mean each table would contain at most 30K rows, which would improve select performance. We would use dynamic SQL in our queries to read/insert.

I feel this would not be a proper use of a database, but how do I prove to my team that this is a bad approach?

Best Answer

Read this paper: Multi-Tenant Data Architecture. It discusses at length the pros and cons of the 3 possible approaches (tenant ID in each table, schema per tenant, database per tenant). Your proposal is what the paper calls "Shared database, Separate schemas", and it is a valid approach. Whether it is the best one for your case, I'm not in a position to say. This is what the paper has to say about this approach:

A significant drawback of the separate-schema approach is that tenant data is harder to restore in the event of a failure. If each tenant has its own database, restoring a single tenant's data means simply restoring the database from the most recent backup. With a separate-schema application, restoring the entire database would mean overwriting the data of every tenant on the same database with backup data, regardless of whether each one has experienced any loss or not. Therefore, to restore a single customer's data, the database administrator may have to restore the database to a temporary server, and then import the customer's tables into the production server—a complicated and potentially time-consuming task.

The separate schema approach is appropriate for applications that use a relatively small number of database tables, on the order of about 100 tables per tenant or fewer. This approach can typically accommodate more tenants per server than the separate-database approach can, so you can offer the application at a lower cost, as long as your customers will accept having their data co-located with that of other tenants.

Read the paper.

which would improve select performance

If you are trying to do this split for performance reasons, you and your team are definitely on the wrong path. Address the perf issues by identifying them and fixing them appropriately. Read How to analyze SQL Server performance. Multi-tenancy ain't cheap, and you need to plan ahead and have an answer to simple questions like 'how do I deploy the next version of my app, which includes a database modification?' and 'how do I restore the data for one particular project?'. The issues you see now are indicative of app tuning problems (hint: you're missing indexes).
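To make the index hint concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a runnable stand-in, not SQL Server, and the table, column, and index names (project_attribute, project_id, ix_attr_project) are hypothetical. The idea it illustrates is general, though: with an index on the project ID, a query for one project seeks straight to that project's rows in the shared table, so the total table size matters far less than it does with a scan. On SQL Server the rough equivalent would be a nonclustered index on projectID, verified by looking at the actual execution plan.

```python
import sqlite3

# SQLite is only a stand-in here so the sketch runs anywhere;
# all table, column, and index names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE project_attribute ("
    " id INTEGER PRIMARY KEY,"
    " project_id INTEGER NOT NULL,"
    " name TEXT,"
    " value TEXT)"
)

# 200 projects x 500 rows each, all stored in one shared table.
rows = [(p, f"attr{i}", f"v{i}") for p in range(200) for i in range(500)]
conn.executemany(
    "INSERT INTO project_attribute (project_id, name, value) VALUES (?, ?, ?)",
    rows,
)

QUERY = "SELECT name, value FROM project_attribute WHERE project_id = ?"


def plan(connection, sql, params):
    """Return the 'detail' column of EXPLAIN QUERY PLAN for a statement."""
    cursor = connection.execute("EXPLAIN QUERY PLAN " + sql, params)
    return [row[3] for row in cursor]


# Without an index on project_id the engine has to scan the whole table.
plan_before = plan(conn, QUERY, (42,))

# With the index it can seek directly to the rows of one project.
conn.execute("CREATE INDEX ix_attr_project ON project_attribute (project_id)")
plan_after = plan(conn, QUERY, (42,))

matching_rows = conn.execute(QUERY, (42,)).fetchall()

print("before:", plan_before)
print("after: ", plan_after)
print("rows for project 42:", len(matching_rows))
```

The exact plan text differs between engines and versions, but the pattern to look for is the same: a scan of the whole table before the index exists, and a seek on the index afterwards. That gets you the "only touch 30K rows" effect without one schema per project.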

As a personal experience note: I've seen databases with millions of objects in them (which the 'shared database, separate schema' approach is naturally going to cause), and it wasn't pretty... I hope all your DB administration is 100% scripted/automated by now, because expanding SSMS tree views to pick 1 object out of 1M is not going to work.
