SQL Server 2012 – Optimizing SQL Merge Statement with Primary Key Sorting

mergesortingsql serversql-server-2012

We are looking to use the SQL MERGE statement in SQL Server 2012 to handle data replication for a data warehouse that we have access to from the manufacturer of a 3rd party application we use for data entry.

The source of the data warehouse is an Oracle environment and we are using SQL Connector to reference the Oracle database in our environment as a linked server. The data is loaded/updated into the Oracle data source on in a once nightly dump.

We are looking at the MERGE statement due to unreliability/instability we have seen with transactional and snapshot replication.

Below is the table structure to do the merge on:

    CREATE TABLE [dbo].[FACT_WIP_2](
    [WIP_KEY] [float] NOT NULL,
    [PATIENT_KEY] [float] NULL,
    [PHARMACY_KEY] [float] NULL,
    [LINE_PHARM_KEY] [float] NULL,
    [DELIVERY_ADDRESS_KEY] [float] NULL,
    [INVENTORY_TYPE_KEY] [float] NULL,
    [PHYSICIAN_LOCATION_KEY] [float] NULL,
    [PRIMARY_ORIGINAL_INS_PLAN_KEY] [float] NULL,
    [PRIMARY_INSURANCE_PLAN_KEY] [float] NULL,
    [REFERRAL_PRIORITY_KEY] [float] NULL,
    [REIMBURSEMENT_STATUS_REF_KEY] [float] NULL,
    [REIMBURSEMENT_STATUS_STAGE_KEY] [float] NULL,
    [REFERRAL_SOURCE_TYPE_KEY] [float] NULL,
    [REFERRAL_START_DATE_KEY] [float] NULL,
    [STAGE_START_DATE_KEY] [float] NULL,
    [STAGE_END_DATE_KEY] [float] NULL,
    [ASSIGNED_EMPLOYEE_KEY] [float] NULL,
    [REFERRAL_COMPLETION_DATE_KEY] [float] NULL,
    [REFERRAL_CYCLE_KEY] [float] NULL,
    [SHIP_DATE_KEY] [float] NULL,
    [SHIP_MODE_KEY] [float] NULL,
    [PLACE_OF_SERVICE_KEY] [float] NULL,
    [REF_BUSINESS_DRIVERS_KEY] [float] NULL,
    [REF_OUTCOME_STATUS_KEY] [float] NULL,
    [STAGE_OUTCOME_STATUS_KEY] [float] NULL,
    [REF_REFERENCE_CATEGORY_KEY] [float] NULL,
    [THERAPY_GROUP_KEY] [float] NULL,
    [FORWARD_REASON_KEY] [float] NULL,
    [FORWARDED_TO_PHARMACY_KEY] [float] NULL,
    [IMAGE_RECEIPT_DATE_KEY] [float] NULL,
    [DATA_SOURCE_KEY] [float] NULL,
    [ORDER_START_DATE_KEY] [float] NULL,
    [DRUG_KEY] [float] NULL,
    [WORK_STAGE_KEY] [float] NULL,
    [REFERRAL_NEED_DATE_KEY] [float] NULL,
    [ORDER_NEED_DATE_KEY] [float] NULL,
    [WIP_TYPE] [varchar](1) NULL,
    [REFERRAL_ID] [varchar](32) NULL,
    [REFERRAL_LINE_NUM] [float] NULL,
    [ORDER_ID] [varchar](150) NULL,
    [ORDER_LINE_NUM] [float] NULL,
    [DIAGNOSIS_CODE] [varchar](10) NULL,
    [DIAGNOSIS_DESCRIPTION] [varchar](40) NULL,
    [QTY_WRITTEN] [float] NULL,
    [METRIC_QTY] [float] NULL,
    [DELETED_VOID_INDICATOR] [varchar](1) NULL,
    [DAY_SUPPLY] [float] NULL,
    [PROFILED_RX_INDICATOR] [numeric](1, 0) NULL,
    [PAT_PRIMARY_INS_CARDHOLDER_ID] [varchar](20) NULL,
    [PAT_PRIMARY_INS_EMPLOYER] [varchar](30) NULL,
    [PAT_PRIMARY_INS_GROUP_NUMBER] [varchar](30) NULL,
    [RX_NUMBER] [float] NULL,
    [REFILL_NUMBER] [float] NULL,
    [REFERRAL_START_TIMESTAMP] [datetime] NULL,
    [STAGE_START_TIMESTAMP] [datetime] NULL,
    [STAGE_END_TIMESTAMP] [datetime] NULL,
    [WORK_STAGE_ASSIGN_STATUS] [varchar](10) NULL,
    [REFERRAL_COMPLETION_TIMESTAMP] [datetime] NULL,
    [ORDER_START_TIMESTAMP] [datetime] NULL,
    [PRIOR_AUTH_APPEAL_INDICATOR] [varchar](20) NULL,
    [BV_MISSING_INFO_INDICATOR] [numeric](1, 0) NULL,
    [ORDER_COMPLETION_TIMESTAMP] [datetime] NULL,
    [SUPPLY_ITEM_INDICATOR] [numeric](1, 0) NULL,
    [FORWARDED_REFERRAL_INDICATOR] [numeric](1, 0) NULL,
    [NOGO_INDICATOR] [numeric](1, 0) NULL,
    [CREATE_UPDATE_DATE] [datetime] NULL,
    [ORDER_STATUS] [varchar](50) NULL,
    [REFERRAL_STATUS] [varchar](50) NULL,
    [VIRTUAL_STAGE_OUTCOME] [varchar](40) NULL,
    [VIRTUAL_STAGE_OUTCOME_REASON] [varchar](40) NULL,
    [UNIQUE_ID] [varchar](250) NULL,
    [STAGED_DATE_TIMESTAMP] [datetime] NULL,
    [REWORK_COUNT] [float] NULL,
 CONSTRAINT [MSHREPL_290_PK_2] PRIMARY KEY CLUSTERED 
    (
        [WIP_KEY] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]

And this is the query we are using to complete our merge:

MERGE FACT_WIP_2 AS S
USING RXCSMAP..SMA.FACT_WIP AS O
ON (S.WIP_KEY = O.WIP_KEY)
WHEN MATCHED
AND O.CREATE_UPDATE_DATE > @DateToStartLooking
AND O.CREATE_UPDATE_DATE <> S.CREATE_UPDATE_DATE
THEN
UPDATE 
SET 
S.PATIENT_KEY = O.PATIENT_KEY,
S.PHARMACY_KEY = O.PHARMACY_KEY,
S.LINE_PHARM_KEY = O.LINE_PHARM_KEY,
S.DELIVERY_ADDRESS_KEY = O.DELIVERY_ADDRESS_KEY,
S.INVENTORY_TYPE_KEY = O.INVENTORY_TYPE_KEY,
S.PHYSICIAN_LOCATION_KEY = O.PHYSICIAN_LOCATION_KEY,
S.PRIMARY_ORIGINAL_INS_PLAN_KEY = O.PRIMARY_ORIGINAL_INS_PLAN_KEY,
S.PRIMARY_INSURANCE_PLAN_KEY = O.PRIMARY_INSURANCE_PLAN_KEY,
S.REFERRAL_PRIORITY_KEY = O.REFERRAL_PRIORITY_KEY,
S.REIMBURSEMENT_STATUS_REF_KEY = O.REIMBURSEMENT_STATUS_REF_KEY,
S.REIMBURSEMENT_STATUS_STAGE_KEY = O.REIMBURSEMENT_STATUS_STAGE_KEY,
S.REFERRAL_SOURCE_TYPE_KEY = O.REFERRAL_SOURCE_TYPE_KEY,
S.REFERRAL_START_DATE_KEY = O.REFERRAL_START_DATE_KEY,
S.STAGE_START_DATE_KEY = O.STAGE_START_DATE_KEY,
S.STAGE_END_DATE_KEY = O.STAGE_END_DATE_KEY,
S.ASSIGNED_EMPLOYEE_KEY = O.ASSIGNED_EMPLOYEE_KEY,
S.REFERRAL_COMPLETION_DATE_KEY = O.REFERRAL_COMPLETION_DATE_KEY,
S.REFERRAL_CYCLE_KEY = O.REFERRAL_CYCLE_KEY,
S.SHIP_DATE_KEY = O.SHIP_DATE_KEY,
S.SHIP_MODE_KEY = O.SHIP_MODE_KEY,
S.PLACE_OF_SERVICE_KEY = O.PLACE_OF_SERVICE_KEY,
S.REF_BUSINESS_DRIVERS_KEY = O.REF_BUSINESS_DRIVERS_KEY,
S.REF_OUTCOME_STATUS_KEY = O.REF_OUTCOME_STATUS_KEY,
S.STAGE_OUTCOME_STATUS_KEY = O.STAGE_OUTCOME_STATUS_KEY,
S.REF_REFERENCE_CATEGORY_KEY = O.REF_REFERENCE_CATEGORY_KEY,
S.THERAPY_GROUP_KEY = O.THERAPY_GROUP_KEY,
S.FORWARD_REASON_KEY = O.FORWARD_REASON_KEY,
S.FORWARDED_TO_PHARMACY_KEY = O.FORWARDED_TO_PHARMACY_KEY,
S.IMAGE_RECEIPT_DATE_KEY = O.IMAGE_RECEIPT_DATE_KEY,
S.DATA_SOURCE_KEY = O.DATA_SOURCE_KEY,
S.ORDER_START_DATE_KEY = O.ORDER_START_DATE_KEY,
S.DRUG_KEY = O.DRUG_KEY,
S.WORK_STAGE_KEY = O.WORK_STAGE_KEY,
S.REFERRAL_NEED_DATE_KEY = O.REFERRAL_NEED_DATE_KEY,
S.ORDER_NEED_DATE_KEY = O.ORDER_NEED_DATE_KEY,
S.WIP_TYPE = O.WIP_TYPE,
S.REFERRAL_ID = O.REFERRAL_ID,
S.REFERRAL_LINE_NUM = O.REFERRAL_LINE_NUM,
S.ORDER_ID = O.ORDER_ID,
S.ORDER_LINE_NUM = O.ORDER_LINE_NUM,
S.DIAGNOSIS_CODE = O.DIAGNOSIS_CODE,
S.DIAGNOSIS_DESCRIPTION = O.DIAGNOSIS_DESCRIPTION,
S.QTY_WRITTEN = O.QTY_WRITTEN,
S.METRIC_QTY = O.METRIC_QTY,
S.DELETED_VOID_INDICATOR = O.DELETED_VOID_INDICATOR,
S.DAY_SUPPLY = O.DAY_SUPPLY,
S.PROFILED_RX_INDICATOR = O.PROFILED_RX_INDICATOR,
S.PAT_PRIMARY_INS_CARDHOLDER_ID = O.PAT_PRIMARY_INS_CARDHOLDER_ID,
S.PAT_PRIMARY_INS_EMPLOYER = O.PAT_PRIMARY_INS_EMPLOYER,
S.PAT_PRIMARY_INS_GROUP_NUMBER = O.PAT_PRIMARY_INS_GROUP_NUMBER,
S.RX_NUMBER = O.RX_NUMBER,
S.REFILL_NUMBER = O.REFILL_NUMBER,
S.REFERRAL_START_TIMESTAMP = O.REFERRAL_START_TIMESTAMP,
S.STAGE_START_TIMESTAMP = O.STAGE_START_TIMESTAMP,
S.STAGE_END_TIMESTAMP = O.STAGE_END_TIMESTAMP,
S.WORK_STAGE_ASSIGN_STATUS = O.WORK_STAGE_ASSIGN_STATUS,
S.REFERRAL_COMPLETION_TIMESTAMP = O.REFERRAL_COMPLETION_TIMESTAMP,
S.ORDER_START_TIMESTAMP = O.ORDER_START_TIMESTAMP,
S.PRIOR_AUTH_APPEAL_INDICATOR = O.PRIOR_AUTH_APPEAL_INDICATOR,
S.BV_MISSING_INFO_INDICATOR = O.BV_MISSING_INFO_INDICATOR,
S.ORDER_COMPLETION_TIMESTAMP = O.ORDER_COMPLETION_TIMESTAMP,
S.SUPPLY_ITEM_INDICATOR = O.SUPPLY_ITEM_INDICATOR,
S.FORWARDED_REFERRAL_INDICATOR = O.FORWARDED_REFERRAL_INDICATOR,
S.NOGO_INDICATOR = O.NOGO_INDICATOR,
S.CREATE_UPDATE_DATE = O.CREATE_UPDATE_DATE,
S.ORDER_STATUS = O.ORDER_STATUS,
S.REFERRAL_STATUS = O.REFERRAL_STATUS,
S.VIRTUAL_STAGE_OUTCOME = O.VIRTUAL_STAGE_OUTCOME,
S.VIRTUAL_STAGE_OUTCOME_REASON = O.VIRTUAL_STAGE_OUTCOME_REASON,
S.UNIQUE_ID = O.UNIQUE_ID,
S.STAGED_DATE_TIMESTAMP = O.STAGED_DATE_TIMESTAMP,
S.REWORK_COUNT = O.REWORK_COUNT
--When no records are matched, insert
--the incoming records from Oracle Table
--to our SQL environment table
WHEN NOT MATCHED BY TARGET THEN
INSERT 
(
    WIP_KEY,
    PATIENT_KEY,
    PHARMACY_KEY,
    LINE_PHARM_KEY,
    DELIVERY_ADDRESS_KEY,
    INVENTORY_TYPE_KEY,
    PHYSICIAN_LOCATION_KEY,
    PRIMARY_ORIGINAL_INS_PLAN_KEY,
    PRIMARY_INSURANCE_PLAN_KEY,
    REFERRAL_PRIORITY_KEY,
    REIMBURSEMENT_STATUS_REF_KEY,
    REIMBURSEMENT_STATUS_STAGE_KEY,
    REFERRAL_SOURCE_TYPE_KEY,
    REFERRAL_START_DATE_KEY,
    STAGE_START_DATE_KEY,
    STAGE_END_DATE_KEY,
    ASSIGNED_EMPLOYEE_KEY,
    REFERRAL_COMPLETION_DATE_KEY,
    REFERRAL_CYCLE_KEY,
    SHIP_DATE_KEY,
    SHIP_MODE_KEY,
    PLACE_OF_SERVICE_KEY,
    REF_BUSINESS_DRIVERS_KEY,
    REF_OUTCOME_STATUS_KEY,
    STAGE_OUTCOME_STATUS_KEY,
    REF_REFERENCE_CATEGORY_KEY,
    THERAPY_GROUP_KEY,
    FORWARD_REASON_KEY,
    FORWARDED_TO_PHARMACY_KEY,
    IMAGE_RECEIPT_DATE_KEY,
    DATA_SOURCE_KEY,
    ORDER_START_DATE_KEY,
    DRUG_KEY,
    WORK_STAGE_KEY,
    REFERRAL_NEED_DATE_KEY,
    ORDER_NEED_DATE_KEY,
    WIP_TYPE,
    REFERRAL_ID,
    REFERRAL_LINE_NUM,
    ORDER_ID,
    ORDER_LINE_NUM,
    DIAGNOSIS_CODE,
    DIAGNOSIS_DESCRIPTION,
    QTY_WRITTEN,
    METRIC_QTY,
    DELETED_VOID_INDICATOR,
    DAY_SUPPLY,
    PROFILED_RX_INDICATOR,
    PAT_PRIMARY_INS_CARDHOLDER_ID,
    PAT_PRIMARY_INS_EMPLOYER,
    PAT_PRIMARY_INS_GROUP_NUMBER,
    RX_NUMBER,
    REFILL_NUMBER,
    REFERRAL_START_TIMESTAMP,
    STAGE_START_TIMESTAMP,
    STAGE_END_TIMESTAMP,
    WORK_STAGE_ASSIGN_STATUS,
    REFERRAL_COMPLETION_TIMESTAMP,
    ORDER_START_TIMESTAMP,
    PRIOR_AUTH_APPEAL_INDICATOR,
    BV_MISSING_INFO_INDICATOR,
    ORDER_COMPLETION_TIMESTAMP,
    SUPPLY_ITEM_INDICATOR,
    FORWARDED_REFERRAL_INDICATOR,
    NOGO_INDICATOR,
    CREATE_UPDATE_DATE,
    ORDER_STATUS,
    REFERRAL_STATUS,
    VIRTUAL_STAGE_OUTCOME,
    VIRTUAL_STAGE_OUTCOME_REASON,
    UNIQUE_ID,
    STAGED_DATE_TIMESTAMP,
    REWORK_COUNT
)
VALUES
(
    O.WIP_KEY,
    O.PATIENT_KEY,
    O.PHARMACY_KEY,
    O.LINE_PHARM_KEY,
    O.DELIVERY_ADDRESS_KEY,
    O.INVENTORY_TYPE_KEY,
    O.PHYSICIAN_LOCATION_KEY,
    O.PRIMARY_ORIGINAL_INS_PLAN_KEY,
    O.PRIMARY_INSURANCE_PLAN_KEY,
    O.REFERRAL_PRIORITY_KEY,
    O.REIMBURSEMENT_STATUS_REF_KEY,
    O.REIMBURSEMENT_STATUS_STAGE_KEY,
    O.REFERRAL_SOURCE_TYPE_KEY,
    O.REFERRAL_START_DATE_KEY,
    O.STAGE_START_DATE_KEY,
    O.STAGE_END_DATE_KEY,
    O.ASSIGNED_EMPLOYEE_KEY,
    O.REFERRAL_COMPLETION_DATE_KEY,
    O.REFERRAL_CYCLE_KEY,
    O.SHIP_DATE_KEY,
    O.SHIP_MODE_KEY,
    O.PLACE_OF_SERVICE_KEY,
    O.REF_BUSINESS_DRIVERS_KEY,
    O.REF_OUTCOME_STATUS_KEY,
    O.STAGE_OUTCOME_STATUS_KEY,
    O.REF_REFERENCE_CATEGORY_KEY,
    O.THERAPY_GROUP_KEY,
    O.FORWARD_REASON_KEY,
    O.FORWARDED_TO_PHARMACY_KEY,
    O.IMAGE_RECEIPT_DATE_KEY,
    O.DATA_SOURCE_KEY,
    O.ORDER_START_DATE_KEY,
    O.DRUG_KEY,
    O.WORK_STAGE_KEY,
    O.REFERRAL_NEED_DATE_KEY,
    O.ORDER_NEED_DATE_KEY,
    O.WIP_TYPE,
    O.REFERRAL_ID,
    O.REFERRAL_LINE_NUM,
    O.ORDER_ID,
    O.ORDER_LINE_NUM,
    O.DIAGNOSIS_CODE,
    O.DIAGNOSIS_DESCRIPTION,
    O.QTY_WRITTEN,
    O.METRIC_QTY,
    O.DELETED_VOID_INDICATOR,
    O.DAY_SUPPLY,
    O.PROFILED_RX_INDICATOR,
    O.PAT_PRIMARY_INS_CARDHOLDER_ID,
    O.PAT_PRIMARY_INS_EMPLOYER,
    O.PAT_PRIMARY_INS_GROUP_NUMBER,
    O.RX_NUMBER,
    O.REFILL_NUMBER,
    O.REFERRAL_START_TIMESTAMP,
    O.STAGE_START_TIMESTAMP,
    O.STAGE_END_TIMESTAMP,
    O.WORK_STAGE_ASSIGN_STATUS,
    O.REFERRAL_COMPLETION_TIMESTAMP,
    O.ORDER_START_TIMESTAMP,
    O.PRIOR_AUTH_APPEAL_INDICATOR,
    O.BV_MISSING_INFO_INDICATOR,
    O.ORDER_COMPLETION_TIMESTAMP,
    O.SUPPLY_ITEM_INDICATOR,
    O.FORWARDED_REFERRAL_INDICATOR,
    O.NOGO_INDICATOR,
    O.CREATE_UPDATE_DATE,
    O.ORDER_STATUS,
    O.REFERRAL_STATUS,
    O.VIRTUAL_STAGE_OUTCOME,
    O.VIRTUAL_STAGE_OUTCOME_REASON,
    O.UNIQUE_ID,
    O.STAGED_DATE_TIMESTAMP,
    O.REWORK_COUNT
)
--When there is a row that exists in our SQL table and
--same record does not exist in Oracle table
--then delete this record from our SQL table
WHEN NOT MATCHED BY SOURCE
THEN
DELETE;

Here is the execution Plan:
Execution Plan

What options for either query configuration or table structure (indexes, statistics, etc…) could we make on our end to try and optimize this process?

Best Answer

Have you looked at the execution plan? It seems to me that this kind of query will have to pull back the entire table from Oracle over the linked server before doing the merge locally. How big is that table? This would probably be the "slow" part.

Modify the source part of the merge to only bring back new or modified records; then do another statement to only bring back the keys in order to do the deletions. It still has to scan everything (twice) but it means shuffling far less data.