Sql-server – Aggregating Multiple COTS Databases into a Data Warehouse or similar for Reporting

data-warehousereportingsql serverssrs

I am sure this is a very basic question, so I apologize, but I've read lots of articles and done a lot of google searches and have not found any leads for this particular challenge I am looking to address. I appreciate any guidance that can be provided.

I have an enterprise environment where we need to have multiple instances of a COTS application spread geographically around the world. These applications are using SQL Server Enterprise Editions 2008, 2012, and 2014 (different versions of the COTS software) and have to remain in their geographic isolation. I have created an SSRS server to facilitate being able to give the customers access to their specific geographically isolated COTS database data. However, the demand has now grown where we want the users to be hitting a data warehouse for the SSRS data so we don't risk impacting our production systems. Additionally, my team would like to be able to aggregate the data for some enterprise reporting.
I have read a bunch on data warehousing but I can't seem to find any solutions to bringing 14 different instances of a single COTS application into a single SQL server for reporting against. Ideally we'd like to be able to track the changes as well. The 3 databases per COTS instance are all named the same regardless of the version. Two of the databases are relatively static (maybe 200 updates per day each), but one is very active (70k+ writes per day). The solution doesn't have to be real time, but does need to at least offer a daily load.

Database 1 ~ 600 MB and 200 writes/day

Database 2 ~ 1.4 GB and 200 writes/day

Database 3 ~ 35-120 GB (depending on location) and 70k writes/day (all application events and audit) [old entries are never modified so would only need to send the delta]

So how can I bring in 42 Vendor Databases into a single data warehouse? Is there an approach better than the data warehouse to pursue? Thank you very much in advance, and if there is anything else that is needed to offer a solution, please let me know.

Best Answer

It appears that you just want a copy of all your databases centrally located, which could then be used to run the same report for different regions, just by changing the datasource?

You could acheive this with a single Server and a single instance.

While I wouldn't call this a data warehouse (per se), but it might be the simplest solution to solve your requirements.

With this approach you would end up with 42 databases on your instance and you could name them anything thing you like, but I would suggest including the source location in the name of the database

- for example
   [Source Server].SourceDB    -->      [Dest Server].DestDB
  [Sydney.xyz.com].Database1   -->  [Central.xyz.com].Sydney_Database1
  [Sydney.xyz.com].Database2   -->  [Central.xyz.com].Sydney_Database2
[Brisbane.xyz.com].Database1   -->  [Central.xyz.com].Brisbane_Database1
[Brisbane.xyz.com].Database2   -->  [Central.xyz.com].Brisbane_Database2
etc...

You could achieve this via

  • Replication ,or
  • Log Shipping

Both of these solutions are relatively simple (assuming you have network access between all your nodes), and replication you only need to replicate speific objects (tables, procs etc) that you use in your reports (whereas log shipping will be the whole database), but replication can be problematic to troubleshoot.