Sql-server – Specifying Order of Executing Stored Procedures

etlschemasql serverstored-procedures

Issue:

The order in which my stored procedures are called produces no errors if run independently, but when run as once, yields the following error:

Msg 2627, Level 14, State 1, Procedure sp_ETL_S1_tbl1, Line 40
Violation of PRIMARY KEY constraint 'PK__tbl1__EE1C17C10200000'. Cannot insert duplicate key in object 'S1.tbl1'. The duplicate key value is (0).
The statement has been terminated.

Hypothesis:

I believe this is due to the first stored procedure containing the DROP TABLE statements, and for some reason it doesn't get called first when all the stored procedures are run together.

Offending Code:

mssql
execute_sps.sql

USE db1;
GO
EXEC sp_ETL_db1_CreateSchemas;
   EXEC sp_ETL_dbo_tbl1;
   EXEC sp_ETL_dbo_tbl2;

Procedures above defined:

Stored procedure to create schema and tables.

sp_ETL_db1_CreateSchemas.sql

USE db1;

DROP PROCEDURE sp_ETL_db1_CreateSchemas;
GO
CREATE PROCEDURE sp_ETL_db1_CreateSchemas
AS

GO
DROP TABLE S1.tbl1
DROP TABLE S1.tbl2

DROP SCHEMA S1;
GO
CREATE SCHEMA S1;
GO
CREATE TABLE S1.tbl1(
    SalesGroupID       varchar(20)
        NOT NULL         PRIMARY KEY
,   SalesGroupType     int
,   SalesGroup         varchar(20)

)
;
GO
CREATE TABLE S1.tbl2(
    [NameID]                    int
        NOT NULL                    PRIMARY KEY
,   [FirstName]                 varchar(50)
,   [MiddleName]                varchar(50)
,   [LastName]                  varchar(50)
,   [LastUpdated]               datetime
)
;

Stored procedures to ETL data from (olddb.tbl1, olddb.tbl2) to (db1.S1.tbl1 , db1.S1.tbl2)

USE db1;
GO
DROP PROCEDURE sp_ETL_S1_tbl1;
GO
CREATE PROCEDURE sp_ETL_S1_tbl1
AS

SET NOCOUNT ON
IF object_id('tempdb.dbo.#SalesGroup') IS NOT NULL
BEGIN DROP TABLE #SalesGroup END

SELECT
    [SalesGroupID]
,   [SalesGroupType]     
,   [SalesGroup]         
INTO #SalesGroup
FROM olddb.tbl1
;

INSERT INTO db1.S1.tbl1(
    [SalesGroupID]       
,   [SalesGroupType]     
,   [SalesGroup]              
)
SELECT
    [SalesGroupID]    
,   [SalesGroupType]
,   [SalesGroup] 
FROM #SalesGroup
USE db1;
GO
DROP PROCEDURE sp_ETL_S1_tbl2;
GO
CREATE PROCEDURE sp_ETL_S1_tbl2
AS

SET NOCOUNT ON
IF object_id('tempdb.dbo.#Names') IS NOT NULL
BEGIN DROP TABLE #Names END

SELECT
    [NameID]
,   [FirstName] 
,   [MiddleName]    
,   [LastName]  
,   [LastUpdated]
INTO #Names
FROM olddb.tbl2
;

INSERT INTO db1.S1.tbl2(
    [NameID]                    
,   [FirstName]                 
,   [MiddleName]                
,   [LastName]                  
,   [LastUpdated]               
)
SELECT
    [NameID]                    
,   [FirstName]                 
,   [MiddleName]                
,   [LastName]                  
,   [LastUpdated]               

FROM #Names

Best Answer

SQL Server will execute the statements in a batch one at a time, from top to bottom. Each statement will complete before the next is executed. In your case the this means the first SP runs to completion, then the second one starts and runs to completion, then the third starts and runs to completion. There is no circumstance in which anything else can occur.

Your error is with how you define stored procedure sp_ETL_db1_CreateSchemas. You use the GO command as if it were part of the stored procedure definition. It is not. As the documentation says

GO is not a Transact-SQL statement; it is a command recognized by the sqlcmd and osql utilities and SQL Server Management Studio Code editor.

SQL Server utilities interpret GO as a signal that they should send the current batch of Transact-SQL statements to an instance of SQL Server.

When you run file sp_ETL_db1_CreateSchemas.sql through SSMS (or any other utility) the statements from the beginning of the file to the first GO are sent to the SQL Server engine. When that call returns the statements between the first GO and the second GO are sent, and so on. Eventually all the batches in that file are processed and the result is what you expect.

Or it is mostly what you expect. The definition of the stored procedure is just a stub consisting of the three lines

CREATE PROCEDURE sp_ETL_db1_CreateSchemas
AS

You can verify this by looking at what SQL Server has stored internally as the definition of that SP:

exec sp_helptext 'sp_ETL_db1_CreateSchemas';

You'll see just those three lines.

When you run the batch consisting of the three SPs together it is this empty stub which is executed. Syntactically it is correct so you do not get an error. But it has no statements so no tables are dropped and all the data remains in tact. Hence when a subsequent SP tries to re-load the same data there are key violations.

If you need to batch statements within a stored procedure you can use the EXECUTE('...'); syntax.