SQL Server – How to Pull Data from Identical Tables in Different Databases


I'm dealing with a situation in SQL Server 2008 where each client is set up as its own database, and every database has an identical table structure.

I have a query that selects from ~20 different tables within a database and inserts the records into a reporting database. What we'd like to do is repeat this process for each database, so that the reporting information from all the client databases ends up in one consolidated reporting database.

The "easy" way would be to just copy/paste the code and do a find/replace all on the database names to have the script execute across all clients. Is there any better way than that though?

Best Answer

This answer assumes you have a master database/table of clients and their associated database names, or some other way to figure out which databases in a given instance of SQL Server are the ones you want to query.

Since this is for reporting purposes, you may want to implement some form of snapshotting technique to ensure the data sets consolidate to the same point in time, on at least a per-client basis. As a nice side effect, these techniques usually also alleviate locking/blocking effects of running these types of queries directly against the production data.
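One possible snapshotting technique is a SQL Server database snapshot. The sketch below is only an illustration: ClientA, its logical data file name ClientA_Data, the snapshot name, the file path, and dbo.Orders are all hypothetical. It also assumes the source database has a single data file (a snapshot needs one file entry per source data file) and an edition that supports snapshots, which in SQL Server 2008 means Enterprise or Developer.

```sql
-- Hypothetical names and path: ClientA, ClientA_Data, dbo.Orders,
-- and the snapshot file location are placeholders.
CREATE DATABASE ClientA_Report_Snapshot
ON (NAME = ClientA_Data, FILENAME = 'D:\Snapshots\ClientA_Report_Snapshot.ss')
AS SNAPSHOT OF ClientA;
GO

-- Reporting queries then read from the snapshot instead of the live database.
SELECT COUNT(*) AS OrderCount
FROM ClientA_Report_Snapshot.dbo.Orders;
GO

-- Drop the snapshot once the consolidation run has finished.
DROP DATABASE ClientA_Report_Snapshot;
GO
```

Every query in a run that reads from the snapshot sees the client's data as of the moment the snapshot was created.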

Pure Dynamic SQL
Using the master list, build up a SQL statement that uses 3-part names like you're doing now, but instead, inject the database name dynamically. It's unclear from the question what format the data is going to end up in, but it sounds like you want to UNION ALL everything together, which is easily accomplished.

This may be advantageous for you now because it will require very little up-front work.

The downside is that this approach is not very flexible and quickly becomes complicated if you need to run the query when the individual database revisions are different (i.e., the query was tied to a given database revision, and when a database update was deployed, it didn't succeed for all clients). It also may have problems or complications depending on the security model you're using for these databases.
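A minimal sketch of the pure dynamic SQL approach, assuming a hypothetical master list ReportingDB.dbo.ClientDatabases with a DatabaseName column. The source table dbo.Orders, the target table ReportingDB.dbo.OrderSummary, and all column names are placeholders for your real reporting query.

```sql
-- Hypothetical names throughout: ReportingDB, dbo.ClientDatabases,
-- dbo.OrderSummary, and dbo.Orders are placeholders, not objects from the question.
DECLARE @DatabaseName sysname;
DECLARE @Sql nvarchar(max);

DECLARE ClientCursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT cd.DatabaseName
    FROM ReportingDB.dbo.ClientDatabases AS cd
    WHERE DB_ID(cd.DatabaseName) IS NOT NULL;  -- skip names that do not exist on this instance

OPEN ClientCursor;
FETCH NEXT FROM ClientCursor INTO @DatabaseName;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- QUOTENAME brackets the name so an unusual value cannot break out of the identifier.
    SET @Sql = N'
        INSERT INTO ReportingDB.dbo.OrderSummary (ClientDatabase, OrderId, OrderDate, Amount)
        SELECT @ClientDatabase, o.OrderId, o.OrderDate, o.Amount
        FROM ' + QUOTENAME(@DatabaseName) + N'.dbo.Orders AS o;';

    -- The database name is also passed as a parameter so each row records its source.
    EXEC sys.sp_executesql
        @Sql,
        N'@ClientDatabase sysname',
        @ClientDatabase = @DatabaseName;

    FETCH NEXT FROM ClientCursor INTO @DatabaseName;
END;

CLOSE ClientCursor;
DEALLOCATE ClientCursor;
```

This version runs one INSERT per database rather than concatenating a single UNION ALL statement. Either shape works; the per-database loop just makes it easier to see which client failed if one of the databases is at a different revision.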

Database-Based Object(s) + Dynamic SQL
This solution creates permanent objects within every database which you can query in an external process. Usually this means a view, stored procedure, or table-valued function. Dynamic SQL will be used to inject the database names into the query as above, but this time all that's needed is to select from, or execute, the database object instead of the raw query directly.

This method gives you all the advantages of hiding the query behind an interface, just as views, stored procs, etc. do for application code. It also makes the security issues much easier to solve.

The disadvantage, of course, is that this needs a little more up-front work and management, and that the queried object(s) can only be deployed in database updates, instead of independently.
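As a rough sketch of the object-based approach, the first batch below would be deployed to every client database, and the second would run from the reporting side. The view name dbo.vw_ReportOrders, the table dbo.Orders, the target ReportingDB.dbo.OrderSummary, and the database name ClientA are all hypothetical.

```sql
-- Deployed once into every client database (hypothetical view and table names).
CREATE VIEW dbo.vw_ReportOrders
AS
SELECT o.OrderId, o.OrderDate, o.Amount
FROM dbo.Orders AS o;
GO

-- Run from the reporting side; ClientA stands in for a name read from the master list.
DECLARE @DatabaseName sysname;
DECLARE @Sql nvarchar(max);
SET @DatabaseName = N'ClientA';

-- Only the database name is injected; the query logic lives in the view.
SET @Sql = N'
    INSERT INTO ReportingDB.dbo.OrderSummary (ClientDatabase, OrderId, OrderDate, Amount)
    SELECT @ClientDatabase, v.OrderId, v.OrderDate, v.Amount
    FROM ' + QUOTENAME(@DatabaseName) + N'.dbo.vw_ReportOrders AS v;';

EXEC sys.sp_executesql
    @Sql,
    N'@ClientDatabase sysname',
    @ClientDatabase = @DatabaseName;
```

In practice the second batch would run once per database name in the master list. Because the outer statement depends only on the view's column list, each client database can carry a view definition that matches its own revision.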

My personal opinion is that the latter approach is a much better long-term solution, so in this type of scenario, that's the one I would opt for. For ad-hoc types of queries, using the pure dynamic SQL approach is fine.

Related Solutions

SQL Server – Standard way to use common data and procedures across multiple databases

In addition to Thomas' answer, for things like common lookup data and generic functions, you can use synonyms.

USE CentralDB;
GO

CREATE TABLE dbo.Dates(...);
GO

USE OtherDB;
GO

CREATE SYNONYM dbo.Dates
  FOR CentralDB.dbo.Dates;

Now your other databases can treat these like first-class, local objects, but you only need to maintain one copy. Note that some functionality is missing, e.g. you can't set up foreign keys.
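For illustration, here is how the synonym might be checked and used from OtherDB. This is only a sketch: it avoids naming any columns of dbo.Dates, since the table definition above is elided.

```sql
USE OtherDB;
GO

-- List the synonyms defined in this database and the objects they point to.
SELECT s.name, s.base_object_name
FROM sys.synonyms AS s;

-- Query through the synonym exactly as if dbo.Dates were a local table.
SELECT COUNT(*) AS DateRows
FROM dbo.Dates;
```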

In a previous life I managed a system where we would have ~500 databases with nearly identical schema on each instance. We managed common data using synonyms, and handled deployment of code that had to be local using Red-Gate SQL Compare and SQL Farms Combine (very much like Multi-Script).

SQL Server – Copy data from one database to another – identical tables, new identity values needed

Unless I'm missing something, your problem is not moving the data; it's dealing with the identity values that are already set up. If that is the case, then try this.

1. Pick a value greater than the current identity values in either DB. I would pick a round value, say 1,000,000.
2. Pick the identity values you want to change (for example, if you have lookup tables that are the same in both DBs, you probably want to leave them as they are).
3. When you move your data, add the value you selected in step 1 to each identity value you want to change.

As long as you are consistent in the value you add, all of your relationships will stay intact in the combined DB. For example:

You have an employee table with a current max id of 200,000 in DB A and 1,400,000 in DB B. You decide to move the data from DB A to DB B because it means moving less data. Since 2,000,000 is greater than both maximums, you add 2,000,000 to the Employee_Id column as you move your employee table.

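USE DatabaseB;
GO

-- Allow explicit values to be written to the identity column during the copy.
SET IDENTITY_INSERT dbo.Employee ON;

-- Shift every incoming key by the chosen offset so it cannot collide with existing rows.
INSERT INTO dbo.Employee (Employee_Id, Other_Columns)
SELECT Employee_Id + 2000000, Other_Columns
FROM DatabaseA.dbo.Employee;

SET IDENTITY_INSERT dbo.Employee OFF;
GO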
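To show how relationships survive the move, here is a sketch for a hypothetical child table (Timesheet, Timesheet_Id, and Hours_Worked are made-up names) that references Employee_Id. It assumes the employee rows have already been copied with the same 2,000,000 offset.

```sql
USE DatabaseB;
GO

-- Hypothetical child table: Timesheet, Timesheet_Id, and Hours_Worked are illustrative names.
SET IDENTITY_INSERT dbo.Timesheet ON;

INSERT INTO dbo.Timesheet (Timesheet_Id, Employee_Id, Hours_Worked)
SELECT t.Timesheet_Id + 2000000,   -- the table's own key, shifted by the offset
       t.Employee_Id + 2000000,    -- foreign key shifted by the same offset, so it still matches its employee
       t.Hours_Worked
FROM DatabaseA.dbo.Timesheet AS t;

SET IDENTITY_INSERT dbo.Timesheet OFF;
GO
```

Only one table per session can have IDENTITY_INSERT set to ON at a time, so turn it off for each table before moving on to the next.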

One last piece of advice: back up everything before you start, in case you make a mistake :)
