SQL Server – How much overhead is involved with querying a DB on the same server

performance, sql-server

History

The company I work for had two databases, DB A and DB B, on two different servers. A contains information pulled from a COTS application database, in addition to information from applications we developed in-house. This was done to ensure the COTS application database was never disrupted, as it is required 24/7/365.

Some of our users required raw data from A, thus B was born. B pulls data from A into its own local tables and provides views for users. (I am not a fan of this design at all.)

Now

Upgrades and migrations happened, and B eventually found its way onto the same server as A. We are considering cleaning up our database structure and have come up with several options. Keep in mind that access to data is a top priority for the users, but we want to cut down on costs. For simplicity, let's call the COTS database Z.

Each option begins with the same step: A pulls data from Z via OPENQUERY.
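
For what it's worth, a minimal sketch of that pull, assuming a linked server named Z and invented table and column names:

-- Hypothetical names throughout: linked server Z, staging table dbo.RawData.
INSERT INTO dbo.RawData (Col1, Col2)
SELECT Col1, Col2
FROM OPENQUERY(Z, 'SELECT Col1, Col2 FROM dbo.SourceTable');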

  1. The data is stored in 2 different tables on A, one for applications and one for the views. The views are moved to A. This method has two benefits: data is never unavailable to users if an application locks the first table, and B can be removed altogether.
  2. The data is stored in 1 table in A. A copy is also stored in B, either at the same time or later via a scheduled procedure (I'm thinking the former). The views stored in B will continue to access the data in B.
  3. The data is stored in 2 different tables on A, similar to option 1. Database B is kept strictly to provide users with views over A's data (see the sketch after this list). This would require the least amount of work for us and our tech-illiterate users. It seems silly, though, to maintain a DB just because we are too lazy to set up some accounts and permissions.
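
For illustration, roughly what an option 3 view in B might look like (all object names invented):

USE B;
GO

-- B stores no data of its own; the view reaches into A with a three-part name.
CREATE VIEW dbo.UserFacingData
AS
SELECT Col1, Col2
FROM A.dbo.RawData;
GO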

I would really love an option where data from Z is stored in 1 table in A and everyone plays nicely, but that's not the case. Although we are the owners of this data, there are other groups that utilize it too. They've managed to lock our database on several occasions, so we're being safe.

I am the type of person who likes to make things as efficient as possible, so I am really leaning toward option 1, but many of my coworkers prefer option 3, and some even option 2. Options 1 and 2 are pretty darn similar, so I was wondering what kind of overhead is involved in accessing another database on the same server.

Perhaps there is an option we have not considered?

EDIT

Note: The tables in question are read-only.

Best Answer

If you just want a readable copy of the data in table A, then there are ways to make this efficient without worrying about whether the copy is in the same database or a different database. Or a different server, for that matter, as long as you can reliably point your application differently depending on whether it is reading or writing, and as long as there isn't a bunch of referential integrity tied to A.

Let's keep things simple and have everything in the same database. Create two new schemas:

CREATE SCHEMA fake   AUTHORIZATION dbo;
CREATE SCHEMA shadow AUTHORIZATION dbo;

Now create a copy of A in the dbo and shadow schemas:

SELECT * INTO shadow.B FROM dbo.A;
SELECT * INTO dbo.B    FROM dbo.A;

(Then add all of the appropriate indexes, etc. to both tables.)
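
For example (invented column and index names; whatever indexes dbo.B gets, shadow.B should get too, so queries behave the same after each swap):

CREATE NONCLUSTERED INDEX IX_B_Col1 ON dbo.B    (Col1);
CREATE NONCLUSTERED INDEX IX_B_Col1 ON shadow.B (Col1);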

Now, create a stored procedure that does the following:

TRUNCATE TABLE shadow.B;

INSERT shadow.B SELECT * FROM dbo.A;

-- optionally add an UPDATE STATISTICS command here
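-- e.g.: UPDATE STATISTICS shadow.B WITH FULLSCAN;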

BEGIN TRANSACTION;

  ALTER SCHEMA fake TRANSFER     dbo.B;
  ALTER SCHEMA dbo  TRANSFER  shadow.B;

COMMIT TRANSACTION;

ALTER SCHEMA shadow TRANSFER fake.B;

-- optionally truncate the now stale copy, though I keep
-- this around for debugging / comparisons:
TRUNCATE TABLE shadow.B;
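
Wrapped up as a procedure, that might look like the following sketch (the name dbo.RefreshB is my own invention):

CREATE PROCEDURE dbo.RefreshB
AS
BEGIN
  SET NOCOUNT ON;

  TRUNCATE TABLE shadow.B;

  INSERT shadow.B SELECT * FROM dbo.A;

  -- Swap the fresh copy in atomically.
  BEGIN TRANSACTION;
    ALTER SCHEMA fake TRANSFER    dbo.B;
    ALTER SCHEMA dbo  TRANSFER shadow.B;
  COMMIT TRANSACTION;

  ALTER SCHEMA shadow TRANSFER fake.B;

  TRUNCATE TABLE shadow.B;
END
GO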

Then schedule it to run in a job (frequency will depend on your tolerance for how long this process takes and how fresh the copy of the data needs to be). You can always call it manually if you want it to be refreshed in between runs.
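
One way to do that is a SQL Server Agent job; the job name, the hourly schedule, and the database name below are all placeholders:

USE msdb;
GO

EXEC dbo.sp_add_job @job_name = N'Refresh B';

EXEC dbo.sp_add_jobstep
    @job_name      = N'Refresh B',
    @step_name     = N'Swap in fresh copy',
    @subsystem     = N'TSQL',
    @database_name = N'YourDatabase',   -- placeholder
    @command       = N'EXEC dbo.RefreshB;';

-- Hourly: freq_type 4 = daily, freq_subday_type 8 = hours.
EXEC dbo.sp_add_jobschedule
    @job_name             = N'Refresh B',
    @name                 = N'Hourly',
    @freq_type            = 4,
    @freq_interval        = 1,
    @freq_subday_type     = 8,
    @freq_subday_interval = 1;

EXEC dbo.sp_add_jobserver @job_name = N'Refresh B';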

Unless you use the Auto-Close feature, the overhead of accessing a table in a different database on the same instance is negligible compared to accessing the same table in the same database. So the process, and any related efficiency, doesn't really change much if you put B in a different database (just an extra prefix on the INSERT/SELECT, and the re-pointing of the application mentioned above). The process also doesn't change much if you move the copy to a different server (though linked servers, distributed transactions, etc. make this a little more cumbersome). The process might be a little slower across servers, but users shouldn't notice unless they also now have to join data between the two servers; if they can query the other server independently to pull data from B, they won't be able to tell the difference, unless the other server is severely underpowered.
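
In other words, the only visible difference to a query is the length of the name it uses (hypothetical names again):

SELECT Col1 FROM dbo.B;                      -- same database
SELECT Col1 FROM OtherDB.dbo.B;              -- same instance, different database
SELECT Col1 FROM OtherServer.OtherDB.dbo.B;  -- different server, via a linked server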

If you have referential integrity, you may find that you need to drop/re-create and/or disable/re-enable constraints in order to perform the schema transfer (I've almost always used this trick for relatively static lookup data and, as a copy primarily used for reporting, did not have DRI). Also, both copies of A will have their own statistics, so half the time you'll have one set of statistics and half the time the other. I haven't done extensive testing of what happens with query plans etc. after the schema transfer.
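
If you go the disable/re-enable route, the pattern looks something like this (table and constraint names invented; depending on the constraint you may still have to drop and re-create it instead):

-- Disable the constraint before the transfer...
ALTER TABLE dbo.SomeChildTable NOCHECK CONSTRAINT FK_SomeChildTable_B;

-- ...and re-enable it afterward; WITH CHECK revalidates existing rows.
ALTER TABLE dbo.SomeChildTable WITH CHECK CHECK CONSTRAINT FK_SomeChildTable_B;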

Of course if you store B in a different database then perhaps the DRI issues etc. are not all that important anyway, but I thought I would mention them.

I have blogged in a bit more detail about how I arrived at this technique:

http://www.sqlperformance.com/2012/08/t-sql-queries/t-sql-tuesday-schema-switch-a-roo