Sql-server – Multiple Non-Indexed Views with INCLUDEs vs. Multiple Indexed Views in High Write Situations

index-designmaterialized-viewsql servert-sqlview

When you want to encapsulate your T-SQL to select different subsets of data from the same base tables, is it more efficient to use plain views in conjunction with nonclustered indexes and INCLUDEs on the base tables, or is it better to use multiple indexed views, even when writes are frequent?

Background: I've run into a design issue that could cause me a lot of problems down the line if I handle it incorrectly, so I'd like some feedback on how best to approach it. Essentially, I have a series of tables consisting mainly of float columns which I need to join in hundreds of queries, each of which retrieves myriad subsets of the same columns and joins them together in similar but not always identical ways. For ease of maintenance, code legibility, modularization and the like, I'd to encapsulate as much of the commonalities in T-SQL across the queries as possible. For example, in the sample code below, I select a slightly different list of columns in the second query than in the first, plus join to a third table; there are dozens of such permutations of similar SQL statements scattered across hundreds of queries. Most of the queries occur in stored procedures that perform one or more UPDATEs, plus some rare DELETEs or INSERTs.

Stored Procedure Example 1

; WITH CTE1
AS
(SELECT T1.ID, Column1, Column2, Column3, Column4, Column3InTable2, Column5InTable3
FROM Table1 AS T1
    INNER JOIN Table2 AS T2
    ON T1.ID = T2.ForeignKeyID
)

UPDATE T1
SET Column1 = Whatever
FROM Table1 AS T1
    INNER JOIN CTE1 AS T1
    ON T1.ID = T2.ID

Stored Procedure Example 2

; WITH CTE1
AS
(
SELECT T1.ID, Column1, Column2, Column3, Column5, Column3InTable2, Column2InTable3, Column3InTable3
FROM Table1 AS T1
    INNER JOIN Table2 AS T2
    ON T1.ID = T2.ForeignKeyID
        INNER JOIN Table3 AS T3
        ON T1.ID = T3.ForeignKeyID
)

UPDATE T1
SET Column3InTable3 = Whatever
FROM Table1 AS T1
    INNER JOIN CTE1 AS T1
    ON T1.ID = T2.ID

What I'd like to use are simplified retrieval structures like this:

CREATE VIEW View1
AS
SELECT T1.ID, Column1, Column2, Column3, Column4, Column3InTable2, Column5InTable3
FROM Table1 AS T1
    INNER JOIN Table2 AS T2
    ON T1.ID = T2.ForeignKeyID

CREATE VIEW View2
AS
SELECT T1.ID, Column1, Column2, Column3, Column5, Column3InTable2, Column2InTable3, Column3InTable3
FROM Table1 AS T1
    INNER JOIN Table2 AS T2
    ON T1.ID = T2.ForeignKeyID
        INNER JOIN Table3 AS T3
        ON T1.ID = T3.ForeignKeyID

Stored Procedure Example 1 Updated

UPDATE T1
SET Column1 = Whatever
FROM View1

Stored Procedure Example 2 Updated

UPDATE View2
SET Column3InTable3 = Whatever

From experience, I've already learned that retrieving the data through table-valued functions leads to poor performance, which improved dramatically when I created a single indexed temporary table on all of the combinations of columns these queries need. Unfortunately, creating different temp tables that retrieve only the subsets of data I need for each query quickly turns into a maintenance and coordination nightmare. Therefore, I still need to refer to complicated joins to the same broad temp table in every procedure, which doesn't help me modularize things at all. Ideally, I'd like to use views to encapsulate the code (look at how much easier it is to read the samples above when they refer to views), but I imagine that creating different indexed views for each of the dozens of base queries I need would rapidly degrade performance, since all of the base tables are frequently updated on almost a 1:1 basis for each read. Could I get around this by instead using a series of non-indexed views to operate only on the columns I need, while indexing only the base tables with a series of INCLUDE clauses tailored to each subset of columns, or am I doomed to run into the same performance degradation due to the frequent UPDATEs?

Thankfully, I only need to update one base table in any given statement, so the restriction against updating multiple tables in a single view isn't an issue (typically, I only need to retrieve the other tables in order to calculate the new values of the updated columns). It's mainly the frequency of the updates which is complicating my efforts at encapsulation. After reading these Microsoft articles on Designing Indexed Views and indexing as well as the replies in the thread Using indexed views for aggregates – too good to be true? I haven't yet seen anything that would discourage me from using this approach; perhaps there's a better one I haven't thought of yet though. I've also toyed with the idea of building these views upon each other hierarchically to save even more code, but don't know if that would further complicate things. Thanks in advance for any advice.

Best Answer

If your previous multi-statement table-valued functions (MSTVF) solved your problem but at the cost of performance, then you should rewrite those functions to be inline table-valued functions (ITVF).

MSTVFs execute once for EVERY ROW. Switching to an inline table-valued function and then CROSS APPLYing that function to your query will mean the function is only run once. You should see significant performance improvement with this approach.

See Wayne Sheffield's Comparing Inline and Multi-Statement Table-Valued Functions post for more information.