Quick Information: Our EDW group has built an on-premises SQL data warehouse by collecting information from several source systems. These SQL DW databases are configured as follows:
- SQL Version: SQL Server 2016 (SP2)
- Partitioned by: Year
- Total Database Size: 10TB
- For HA: Utilizing Always On (1-Primary & 1-Secondary)
- Environment: VMWare
- Change Tracking enabled on a couple of tables in the EDW
EDW developers have created views and stored procedures for end users (i.e. BI analytics, applications) to consume the data; this way, whatever changes or modifications the EDW group makes to underlying objects on their end do not directly affect end users. The application developer group (i.e. the OLTP app), who consume these views/stored procedures, are not allowed to change these EDW objects, nor are they allowed to run ad-hoc queries from the OLTP application, as it is also critical for the EDW group to finish their loads within certain hours. We are looking at these views and stored procedures to see if we can tune them for performance.
From what I have read online so far, it is not good practice to use an EDW for OLTP load, as OLTP applications are chatty and mostly need recent data rather than historical data.
Question: What are your suggestions on how we can use our current EDW data for these OLTP applications so that we are not sacrificing load times for performance, and vice versa?
Best Answer
Can you have your cake and eat it, too?
nope.
But there are some things you can do. I've worked with an EHR product called NextGen. It has an OLTP database that never deletes/archives any data, so it also serves as the data warehouse.
Short story: it sucks.
Less-short story: you can keep things kinda-not-slow if you cluster every table on some kind of date column, so the most recent data sits together at the end of the index.
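As a rough sketch of that clustering idea (the table and column names below are made up for illustration, not from the original system):

```sql
-- Hypothetical fact table clustered on a date column so the most
-- recent rows land together at the logical end of the clustered index.
CREATE TABLE dbo.FactEncounter
(
    EncounterID   BIGINT         NOT NULL,
    EncounterDate DATE           NOT NULL,
    PatientID     INT            NOT NULL,
    ChargeAmount  DECIMAL(12, 2) NULL
);

-- Cluster on the date first; include the ID so the key stays unique.
CREATE UNIQUE CLUSTERED INDEX CIX_FactEncounter_Date
    ON dbo.FactEncounter (EncounterDate, EncounterID);
```

With this layout, queries that filter on recent dates scan a narrow, contiguous range instead of the whole table.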
You also have to pay attention to DML (deletes, updates, inserts), making sure statements don't lock tables for too long at a time. I would sometimes resort to cursors, which made the overall duration of the DML longer but reduced the time of each individual lock.
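The same lock-shortening effect can often be had without an explicit cursor by batching the DML. A hedged sketch, reusing the hypothetical table name from above:

```sql
-- Purge old rows in small batches so each statement holds its locks
-- only briefly, instead of one long blocking DELETE.
DECLARE @BatchSize INT = 5000;

WHILE 1 = 1
BEGIN
    DELETE TOP (@BatchSize)
    FROM dbo.FactEncounter
    WHERE EncounterDate < DATEADD(YEAR, -7, GETDATE());

    IF @@ROWCOUNT < @BatchSize
        BREAK;  -- last partial batch: nothing more to delete

    -- A brief pause lets waiting readers/writers get in between batches.
    WAITFOR DELAY '00:00:01';
END;
```

As with the cursor approach, the total runtime goes up, but no single lock is held long enough to starve the OLTP side.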
I also avoided triggers, using jobs running every 5-15 minutes instead, whenever possible.
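Since Change Tracking is already enabled on some of the EDW tables, one of those periodic jobs can pull only the rows changed since the last run, instead of a trigger firing on every statement. A sketch under that assumption (the watermark table, copy table, and column names are all hypothetical):

```sql
-- SQL Agent job step, run every 5-15 minutes: fetch rows changed since
-- the last sync via Change Tracking, then upsert them into a copy table.
DECLARE @LastVersion BIGINT, @CurrentVersion BIGINT;

SELECT @LastVersion = LastSyncVersion
FROM dbo.SyncState                       -- hypothetical watermark table
WHERE TableName = 'dbo.FactEncounter';

SET @CurrentVersion = CHANGE_TRACKING_CURRENT_VERSION();

MERGE dbo.FactEncounter_Copy AS tgt      -- hypothetical reporting copy
USING
(
    SELECT ct.EncounterID, ct.SYS_CHANGE_OPERATION,
           src.EncounterDate, src.PatientID, src.ChargeAmount
    FROM CHANGETABLE(CHANGES dbo.FactEncounter, @LastVersion) AS ct
    LEFT JOIN dbo.FactEncounter AS src
        ON src.EncounterID = ct.EncounterID
) AS chg
    ON tgt.EncounterID = chg.EncounterID
WHEN MATCHED AND chg.SYS_CHANGE_OPERATION = 'D' THEN
    DELETE
WHEN MATCHED THEN
    UPDATE SET EncounterDate = chg.EncounterDate,
               PatientID     = chg.PatientID,
               ChargeAmount  = chg.ChargeAmount
WHEN NOT MATCHED AND chg.SYS_CHANGE_OPERATION <> 'D' THEN
    INSERT (EncounterID, EncounterDate, PatientID, ChargeAmount)
    VALUES (chg.EncounterID, chg.EncounterDate, chg.PatientID, chg.ChargeAmount);

UPDATE dbo.SyncState
SET LastSyncVersion = @CurrentVersion
WHERE TableName = 'dbo.FactEncounter';
```

This keeps the per-statement cost off the OLTP writes and moves the sync work into a schedulable window.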
Here's the script I used to get the whole database at once: