SQL Server – Are there ways to replace SQL Server stored procedures only when the definition has changed


I know the following two patterns to write scripts for creating stored procedures, which can be executed repeatedly without throwing errors.

if object_id('dbo.my_proc') is not null drop procedure dbo.my_proc
go
create procedure dbo.my_proc as Print 'This is not a dummy';

and the other, which preserves permissions:

if object_id('dbo.my_proc') is null
    EXEC ('create procedure dbo.my_proc as Print ''This is a dummy''');
go
ALTER PROCEDURE dbo.my_proc as Print 'This is not a dummy';
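
(On SQL Server 2016 SP1 and later there is a third option: CREATE OR ALTER collapses the dummy-create-then-alter pattern into a single statement and likewise preserves permissions. Note that, like the patterns above, it still replaces the procedure unconditionally:)

    -- SQL Server 2016 SP1+ only; creates the proc if missing, alters it otherwise
    CREATE OR ALTER PROCEDURE dbo.my_proc
    AS
        Print 'This is not a dummy';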
    

I guess that when the procedure exists and its hash matches that of the new version, then there would be no need to drop and recreate, or alter, the procedure.

My problem is that HashBytes is limited to a maximum of 8000 bytes of input, so I can't use it generally, as in

if object_id('dbo.my_proc') is null
    EXEC ('create procedure dbo.my_proc as Print ''This is a dummy''');
go
if object_id('dbo.my_proc') is not null
and
(
    Select HashBytes('MD5', definition)
    from sys.sql_modules
    where object_id = object_id('dbo.my_proc')
) <> 0x9028A1B9D93AC7592EC939CCABF9D3DE
begin
    print 'definition has changed';
    EXEC ('ALTER PROCEDURE dbo.my_proc as Print ''This is not a dummy''');
end

for procedures whose definitions are longer than 4000 characters (8000 bytes when stored as nvarchar). Any proposals for handling these cases in a similar way?
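
(For reference, the cap is easy to demonstrate; the 8000-byte input limit applies up to SQL Server 2014 and was lifted in SQL Server 2016:)

    -- 4001 nvarchar characters = 8002 bytes of input;
    -- this errors out on SQL Server 2014 and earlier
    SELECT HASHBYTES('MD5', REPLICATE(CAST(N'x' AS nvarchar(max)), 4001));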

Edit:

It is not only that I want to avoid flushing cached plans.

I also have to cope with different customers running different versions of a stored procedure, where I want to replace only one of these variants with a newer version.

Best Answer

We recently had this discussion at my workplace.

First, I want to commend you for doing the "right thing" by using HASHBYTES() over CHECKSUM() to detect changes. Microsoft specifically cautions against using CHECKSUM(@input) for this purpose as its collision rate is very high compared to that of HASHBYTES('SHA1', @input). One advantage CHECKSUM() does have, though, is that there is no (obvious) restriction on the size of its input.

Second, if you use HASHBYTES() I recommend SHA1 as your hash algorithm. Of the options available before SQL Server 2012 it has the lowest collision rate, and speed is not a concern for your use case. (On SQL Server 2012 and later, SHA2_256 and SHA2_512 are also available and are even stronger choices.)

Finally, to use HASHBYTES() against inputs larger than 8000 bytes you'll have to:

  1. Split your input into 8000 byte chunks.
  2. Hash each chunk.
  3. Somehow combine the resulting hashes and hash them to get your final output.

    You can do this in one of two ways:

    1. Convert your hash outputs into strings, concatenate them, and hash the result.
    2. Stick all your hash outputs into a memory table and take their aggregate checksum using CHECKSUM_AGG().
  4. Encapsulate this work as a function that takes NVARCHAR(MAX) as its input.
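
The steps above can be sketched as a scalar function. This is a sketch only, using option 1 (concatenate the chunk hashes, then hash the concatenation); the function name dbo.HashBytesMax is made up for illustration, not a built-in:

    -- Hash an nvarchar(max) input of arbitrary length by chunking it
    -- into 4000-character (8000-byte) pieces and rehashing the chunk hashes.
    CREATE FUNCTION dbo.HashBytesMax (@input nvarchar(max))
    RETURNS varbinary(20)
    AS
    BEGIN
        DECLARE @pos int = 1,
                @len int = DATALENGTH(@input) / 2,  -- length in characters
                @chunkChars int = 4000,             -- 4000 nvarchar chars = 8000 bytes
                @hashes varbinary(max) = 0x;

        WHILE @pos <= @len
        BEGIN
            -- SHA1 yields 20 bytes per chunk
            SET @hashes = @hashes + HASHBYTES('SHA1', SUBSTRING(@input, @pos, @chunkChars));
            SET @pos = @pos + @chunkChars;
        END;

        -- 20 bytes per chunk keeps @hashes under the 8000-byte cap for
        -- definitions up to 400 chunks (~1.6 million characters)
        RETURN HASHBYTES('SHA1', @hashes);
    END;

For inputs beyond that ~1.6-million-character point you would need a second level of chunking, or the CHECKSUM_AGG() variant from option 2.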

All that said, it is all-around simpler to just compare the proc definitions directly using OBJECT_DEFINITION() as gbn suggested, or to simply push all definitions out everywhere as often as you like, as Mike suggested.
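
A direct comparison might look like the following sketch. One caveat: OBJECT_DEFINITION() returns the text exactly as it was submitted (including whitespace and the original CREATE keyword), so the literal you compare against must match character for character:

    DECLARE @new nvarchar(max) =
        N'create procedure dbo.my_proc as Print ''This is not a dummy'';';

    IF OBJECT_DEFINITION(OBJECT_ID('dbo.my_proc')) IS NULL
       OR OBJECT_DEFINITION(OBJECT_ID('dbo.my_proc')) <> @new
    BEGIN
        PRINT 'definition has changed';
        -- create or alter dbo.my_proc here
    END;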

I wonder what kind of environment would significantly benefit from a process that deployed only changed procedures and used hashes to avoid copying around and comparing full definitions. You'd need to have a lot of procedures to keep in sync.