SQL Server – How to Pull Data from Identical Tables in Different Databases

sql server

I'm dealing with a situation in SQL Server 2008 where each client is setup as their own database, with each database having identical table structures.

I have a query that selects from across ~20 different tables within a database, and inserts the records into a reporting database. What we'd like to do is repeat this process for each database, so that all reporting information across all the client databases are then contained in a consolidated reporting database.

The "easy" way would be to just copy/paste the code and do a find/replace all on the database names to have the script execute across all clients. Is there any better way than that though?

Best Answer

This answer assumes you have a master database/table of clients and their associated database names, or some way to figure out which databases in a given instance of SQL Server have relevant databases that you want to query.

Since this is for reporting purposes, you may want to implement some form of snapshotting technique to ensure the data sets consolidate to the same point in time, on at least a per-client basis. As a nice side effect, these techniques usually also alleviate locking/blocking effects of running these types of queries directly against the production data.

Pure Dynamic SQL
Using the master list, build up a SQL statement that uses 3-part names like you're doing now, but instead, inject the database name dynamically. It's unclear from the question what format the data is going to end up in, but it sounds like you want to UNION ALL everything together, which is easily accomplished.

This may be advantageous for you now because it will require very little up-front work.

The downside is that this approach is not very flexible and quickly becomes complicated if you need to run the query when the individual database revisions are different (i.e., the query was tied to a given database revision, and when a database update was deployed, it didn't succeed for all clients). It also may have problems or complications depending on the security model you're using for these databases.

Database-Based Object(s) + Dynamic SQL
This solution creates permanent objects within every database which you can query in an external process. Usually this means a view, stored procedure, or table-valued function. Dynamic SQL will be used to inject the database names into the query as above, but this time all that's needed is to select from, or execute, the database object instead of the raw query directly.

This method gives all the advantages of protecting your code behind an interface just like it would by using views, stored procs, etc., in application code. This also solves the security issues much more easily.

The disadvantage, of course, is that this needs a little more up-front work and management, and that the queried object(s) can only be deployed in database updates, instead of independently.


My personal opinion is that the latter approach is a much better long-term solution, so in this type of scenario, that's the one I would opt for. For ad-hoc types of queries, using the pure dynamic SQL approach is fine.