SQL Server – Database stored procedure with a "preview mode"

sql-server t-sql

A fairly common pattern in the database application I work with is the need to create a stored procedure for a report or utility that has a "preview mode". When such a procedure performs updates, this parameter indicates that the results of the action should be returned, but that the updates should not actually be applied to the database.

One way to accomplish this is to simply write an if statement for the parameter, and have two complete code blocks; one of which does updates and returns data and the other just returns the data. But this is undesirable because of the code duplication and a relatively low degree of confidence that the preview data is actually an accurate reflection of what would happen with an update.

The following example attempts to leverage transaction savepoints and table variables (which are not affected by transaction rollbacks, in contrast to temp tables, which are) to use a single block of code for both the preview mode and the live update mode.

Note: Transaction rollbacks are not an option since this procedure call may itself be nested in a transaction. This is tested on SQL Server 2012.

CREATE TABLE dbo.user_table (a int);
GO

CREATE PROCEDURE [dbo].[PREVIEW_EXAMPLE] (
  @preview char(1) = 'Y'
) AS

CREATE TABLE #dataset_to_return (a int);

BEGIN TRANSACTION; -- preview mode required infrastructure
  DECLARE @output_to_return TABLE (a int);
  SAVE TRANSACTION savepoint;

  -- do stuff here
  INSERT INTO dbo.user_table (a)
    OUTPUT inserted.a INTO @output_to_return (a)
    VALUES (42);

  -- catch preview mode
  IF @preview = 'Y'
    ROLLBACK TRANSACTION savepoint;

  -- save output to temp table if used for return data
  INSERT INTO #dataset_to_return (a)
  SELECT a FROM @output_to_return;
COMMIT TRANSACTION;

SELECT a AS proc_return_data FROM #dataset_to_return;
RETURN 0;
GO

-- Examples
EXEC dbo.PREVIEW_EXAMPLE @preview = 'Y';
SELECT a AS user_table_after_preview_mode FROM user_table;

EXEC dbo.PREVIEW_EXAMPLE @preview = 'N';
SELECT a AS user_table_after_live_mode FROM user_table;

-- Cleanup
DROP TABLE dbo.user_table;
DROP PROCEDURE dbo.PREVIEW_EXAMPLE;
GO

I'm looking for feedback on this code and design pattern, and/or if other solutions to the same problem exist in different formats.

Best Answer

There are several flaws to this approach:

  1. The term "preview" can be quite misleading in most cases, depending on the nature of the data being operated on (and that changes from operation to operation). What is to ensure that the current data being operated on will be in that same state between the time the "preview" data is gathered and when the user comes back 15 minutes later -- after grabbing some coffee, stepping outside for a smoke, walking around the block, coming back in, and checking something on eBay -- and realizes that they didn't click the "OK" button to actually perform the operation and so finally clicks the button?

    Do you have a time-limit on proceeding with the operation after the preview is generated? Or possibly a way to determine that the data is in the same state at modification time as it was at initial SELECT time?

  2. This is a minor point as the example code could have been done hastily and not represent a true use-case, but why would there be a "Preview" for an INSERT operation? That could make sense when inserting multiple rows via something like INSERT...SELECT and there could be a variable number of rows inserted, but this doesn't make much sense for a singleton operation.

  3. this is undesirable because of ... a relatively low degree of confidence that the preview data is actually an accurate reflection of what would happen with an update.

    Where exactly does this "low degree of confidence" come from? While it is possible to update a different number of rows than show up for a SELECT when multiple tables are JOINed and there is duplication of rows in a result set, that should not be an issue here. Any rows that should be affected by an UPDATE are selectable on their own. If there is a mismatch then you are doing the query incorrectly.

    And those situations where there is duplication due to a JOINed table that matches multiple rows in the table that will be updated are not situations where a "Preview" would be generated. And if there is an occasion where this is the case, then it needs to be explained to the user that they are updating a subset of the report that is repeated within the report, so that it does not appear to be an error if someone is only looking at the number of affected rows.

  4. For the sake of completeness (even though the other answers mentioned this), you are not using the TRY...CATCH construct so could easily run into issues when nesting these calls (even if not using Save Points, and even if not using Transactions). Please see my answer to the following Question, here on DBA.SE, for a template that handles transactions across nested Stored Procedure calls:

    Are we required to handle Transaction in C# Code as well as in stored procedure
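
    As a rough illustration of the kind of nesting-safe template linked above (this is a sketch under my own assumptions, not the exact template from that answer; the procedure and variable names are invented): only the outermost call opens and closes the transaction, and errors are re-raised so callers can react.

    ```sql
    CREATE PROCEDURE dbo.NestingSafe_Example
    AS
    BEGIN
      SET NOCOUNT ON;
      DECLARE @StartedTransaction bit = 0;

      BEGIN TRY
        IF @@TRANCOUNT = 0
        BEGIN
          -- We are the outermost call, so we own the transaction.
          SET @StartedTransaction = 1;
          BEGIN TRANSACTION;
        END;

        -- ... DML here ...

        IF @StartedTransaction = 1
          COMMIT TRANSACTION;
      END TRY
      BEGIN CATCH
        -- Roll back only if this call opened the transaction and it still exists.
        IF @StartedTransaction = 1 AND @@TRANCOUNT > 0
          ROLLBACK TRANSACTION;

        THROW;  -- re-raise so the caller sees the original error
      END CATCH;
    END;
    ```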

  5. EVEN IF the issues noted above were accounted for, there is still a critical flaw: for the short period of time the operation is being performed (i.e. prior to the ROLLBACK), any dirty-read queries (queries using WITH (NOLOCK) or SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED) can grab data that isn't there a moment later. While anyone using dirty-read queries should already be aware of this and have accepted that possibility, operations such as this greatly increase the chances of introducing data anomalies that are very difficult to debug (meaning: how much time do you want to spend trying to find a problem that has no apparent direct cause?).

  6. A pattern like this also degrades system performance by both increasing blocking by taking out more locks, and generating more Transaction Log activity. (I see now that @MartinSmith also mentioned these 2 issues in a comment on the Question.)

    Additionally, if there are Triggers on the tables being modified, that could be quite a bit of additional processing (CPU and Physical/Logical reads) that is unnecessary. Triggers would also further increase the chances of data anomalies resulting from dirty reads.

  7. Related to the point noted directly above -- increased locks -- the use of the Transaction increases the likelihood of running into deadlocks, especially if Triggers are involved.

  8. A less severe issue that should relate only to the less-likely scenario of INSERT operations: the "Preview" data might not be the same as what is inserted with regards to column values determined by DEFAULT Constraints (Sequences / NEWID() / NEWSEQUENTIALID()) and IDENTITY.

  9. There is no need for the additional overhead of writing the contents of the Table Variable into the Temporary Table. The ROLLBACK wouldn't affect the data in the Table Variable (which is why you said you were using Table Variables in the first place), so it would make more sense to simply SELECT FROM @output_to_return; at the end, and then don't even bother creating the Temporary Table.
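
    To illustrate the simplification (a sketch based on the question's own code; it still has the other problems discussed above, this only removes the redundant temp table): the table variable survives the ROLLBACK, so you can return from it directly.

    ```sql
    CREATE PROCEDURE dbo.PREVIEW_EXAMPLE_SIMPLIFIED (
      @preview char(1) = 'Y'
    ) AS

    BEGIN TRANSACTION;
      DECLARE @output_to_return TABLE (a int);
      SAVE TRANSACTION savepoint;

      INSERT INTO dbo.user_table (a)
        OUTPUT inserted.a INTO @output_to_return (a)
        VALUES (42);

      IF @preview = 'Y'
        ROLLBACK TRANSACTION savepoint;  -- table variable keeps its rows
    COMMIT TRANSACTION;

    -- No #dataset_to_return needed; return straight from the table variable.
    SELECT a AS proc_return_data FROM @output_to_return;
    ```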

  10. Just in case this nuance of Save Points is not known (hard to tell from the example code as it only shows a single Stored Procedure): you need to use unique Save Point names so that the ROLLBACK {save_point_name} operation behaves as you are expecting it to. If you re-use the names, a ROLLBACK will roll-back the most recent Save Point of that name, which might not be at the same nesting level where the ROLLBACK is being called from. Please see the first example code block in the following answer to see this behavior in action: Transaction in a stored procedure
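
    One hypothetical way to avoid Save Point name collisions across nesting levels is to derive a per-invocation name (the `'sp_'` prefix and NEWID() scheme here are my own invention, just to show the mechanic; SAVE TRANSACTION accepts a variable for the name):

    ```sql
    BEGIN TRANSACTION;

    -- Generate a name that is unique to this invocation so nested calls
    -- of the same procedure never roll back each other's savepoints.
    -- Savepoint names are limited to 32 characters, hence the LEFT().
    DECLARE @SavePointName varchar(32) =
      LEFT('sp_' + REPLACE(CONVERT(varchar(36), NEWID()), '-', ''), 32);

    SAVE TRANSACTION @SavePointName;

    -- ... DML ...

    ROLLBACK TRANSACTION @SavePointName;  -- rolls back to exactly this savepoint

    COMMIT TRANSACTION;
    ```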

What this comes down to is:

  • Doing a "Preview" doesn't make a lot of sense for user-facing operations. I do this frequently for maintenance operations so that I can see what will be deleted / Garbage Collected if I proceed with the operation. I add an optional parameter called @TestMode and use an IF statement that does a SELECT when @TestMode = 1, else does the DELETE. I sometimes add the @TestMode parameter to Stored Procedures called by the application so that I (and others) can do simple testing without affecting the state of the data, but this parameter is never used by the application.
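
    A sketch of that @TestMode pattern (table name, column, and retention window are all invented for illustration); the key is that both branches share the same WHERE clause, so the SELECT shows exactly what the DELETE would remove:

    ```sql
    CREATE PROCEDURE dbo.GarbageCollect_Example
      @TestMode bit = 0
    AS
    SET NOCOUNT ON;

    IF @TestMode = 1
      -- Test mode: show the rows that would be deleted, change nothing.
      SELECT *
      FROM   dbo.SomeQueueTable
      WHERE  ProcessedDate < DATEADD(DAY, -30, GETDATE());
    ELSE
      -- Live mode: same predicate, actually delete.
      DELETE
      FROM   dbo.SomeQueueTable
      WHERE  ProcessedDate < DATEADD(DAY, -30, GETDATE());
    ```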

  • Just in case this was not clear from the top section of "issues":

    If you do need/want a "Preview" / "Test" mode to see what should be affected if the DML statement were to be executed, then do NOT use Transactions (i.e. the BEGIN TRAN...ROLLBACK pattern) to accomplish this. It is a pattern that, at best, only really works on a single-user system, and not even a good idea in that situation.

  • Repeating the bulk of the query between the two branches of the IF statement does present a potential problem of needing to update both of them each time there is a change to make. However, differences between the two queries are usually easy enough to catch in a code review and easy to fix. On the other hand, problems such as state differences and dirty reads are much harder to find and fix. And the problem of decreased system performance is impossible to fix. We need to recognize and accept that SQL is not an Object-Oriented language, and encapsulation / reducing duplicated code was not a design-goal of SQL like it was with many other languages.

    If the query is long / complex enough, you can encapsulate it in an Inline Table-Valued Function. Then you can do a simple SELECT * FROM dbo.MyTVF(params); for the "Preview" mode, and JOIN to the key value(s) for the "do it" mode. For example:

    UPDATE tab
    SET    tab.Col2 = tvf.ColB
           ...
    FROM   dbo.Table tab
    INNER JOIN dbo.MyTVF(params) tvf
            ON tvf.ColA = tab.Col1;
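
    And a hypothetical definition of that inline TVF (column and table names follow the UPDATE example above; the predicate is invented). The shared query lives in one place, so the preview and the update cannot drift apart:

    ```sql
    CREATE FUNCTION dbo.MyTVF (@Param int)
    RETURNS TABLE
    AS
    RETURN
      SELECT t.Col1          AS ColA,
             t.Col2 + @Param AS ColB  -- the "new" value the update would apply
      FROM   dbo.[Table] t            -- bracketed since TABLE is a reserved word
      WHERE  t.Col3 = 'some condition';

    -- Preview mode is then just:
    -- SELECT * FROM dbo.MyTVF(5);
    ```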
    
  • If this is a report scenario as you mentioned it could be, then running the initial report is the "Preview". If someone wants to change something they see on the report (a status perhaps), then that doesn't require an additional preview since the expectation is to change the currently displayed data.

    If the operation is to perhaps change a bid amount by a certain % or business rule, then that can be handled in the presentation layer (JavaScript?).

  • If you really need to do a "Preview" for an end-user-facing operation, then you need to capture the state of the data first (maybe a hash of all of the fields in the result set for UPDATE operations or the key values for DELETE operations), and then, before performing the operation, compare the captured state info with the current info -- within a Transaction doing a HOLD lock on the table so that nothing changes after doing this comparison -- and if there are ANY difference, throw an error and do a ROLLBACK rather than proceed with the UPDATE or DELETE.

    For detecting differences for UPDATE operations, an alternative to computing a hash on the relevant fields would be to add a column of type ROWVERSION. The value of a ROWVERSION datatype automatically changes every time there is a change to that row. If you had such a column, you would SELECT it along with the other "Preview" data, and then pass it along to the "sure, go ahead and do the update" step along with the key value(s) and value(s) to change. You would then compare those ROWVERSION values passed in from the "Preview" with the current values (per each key), and only proceed with the UPDATE if ALL of the values matched. The benefit here is that you don't need to calculate a hash, which has the potential, even if unlikely, for false negatives, and which takes some amount of time each and every time you do the SELECT. On the other hand, the ROWVERSION value is updated automatically whenever the row changes, so it is nothing you ever need to worry about maintaining. However, the ROWVERSION type is 8 bytes, which can add up when dealing with many tables and/or many rows.

    There are pros and cons to each of these two methods for dealing with detecting inconsistent state related to UPDATE operations, so you would need to determine which method has more "pro" than "con" for your system. But in either case, you can avoid a delay between generating the Preview and doing the operation from causing behavior outside of the end-user's expectations.
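
    As a sketch of the ROWVERSION variant (all names here — the table, its columns, the procedure — are assumptions for illustration): the confirm step passes back the version value captured during the "Preview" and makes the UPDATE conditional on it. A ROWVERSION value is passed as varbinary(8).

    ```sql
    CREATE PROCEDURE dbo.ConfirmUpdate_Example
      @KeyValue        int,
      @NewBidAmount    decimal(19, 4),
      @OriginalVersion varbinary(8)  -- rowversion value captured by the Preview
    AS
    SET NOCOUNT ON;

    UPDATE dbo.Bids
    SET    BidAmount = @NewBidAmount
    WHERE  BidID = @KeyValue
    AND    VersionStamp = @OriginalVersion;  -- rowversion column on dbo.Bids

    -- If the row changed (or was deleted) since the Preview, no row matches.
    IF @@ROWCOUNT = 0
      THROW 50001, 'Row was modified after the Preview was generated.', 1;
    ```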

  • If you are doing an end-user-facing "Preview" mode, then in addition to capturing the state of the records at select-time, passing it along, and checking it at modification-time, include a DATETIME for SelectTime and populate it via GETDATE() or something similar. Pass that along to the app layer so that it can be passed back to the Stored Procedure (most likely as a single input parameter) so that it can be checked in the Stored Procedure. Then you can determine that IF the operation is not the "Preview" mode, then the @SelectTime value needs to be no more than X minutes prior to the current value of GETDATE(). Maybe 2 minutes? 5 minutes? Most likely not more than 10 minutes. Throw an error if the DATEDIFF in MINUTES is over that threshold.
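
    That staleness check can be as small as this fragment (in the real procedure, @SelectTime would be an input parameter rather than a local DECLARE, and the 10-minute threshold is just the example figure from above):

    ```sql
    DECLARE @SelectTime datetime;  -- input parameter in the real procedure

    -- Reject the operation if the Preview data is too old to trust.
    IF DATEDIFF(MINUTE, @SelectTime, GETDATE()) > 10
      THROW 50002, 'Preview data is more than 10 minutes old; please refresh.', 1;
    ```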