Merge row size overflow in SQL Server – "Cannot create a row of size…"

Tags: merge, sql-server, sql-server-2012

The target table to which I'm trying to merge the data has ~660 columns.
The code for the merge:

MERGE TBL_BM_HSD_SUBJECT_AN_1 AS targetTable
USING
        (                
SELECT * 
FROM TBL_BM_HSD_SUBJECT_AN_1_STAGING
WHERE [ibi_bulk_id] in (20150520141627106) and  id in(101659113)
    ) AS sourceTable 
ON (...)
WHEN MATCHED AND ((targetTable.[sampletime] <= sourceTable.[sampletime]))
        THEN UPDATE SET ...
WHEN NOT MATCHED 
        THEN INSERT (...)
    VALUES (...)

The first time I ran this (i.e. when the table was empty) it succeeded,
and inserted one row.

The second time I ran this, with the same data set, an error was returned:
Cannot create a row of size 8410 which is greater than the allowable maximum row size of 8060.

Why did the second attempt to merge the same row, which had already been inserted, result in an error? If this row exceeded the maximum row size, I would expect it not to have been possible to insert it in the first place.

So I tried two things (and both succeeded!):

  • Removing the WHEN NOT MATCHED section from the MERGE statement
  • Running an UPDATE statement with the same row I tried to merge

Why does the update fail when performed via MERGE, while the insert succeeds, and a direct UPDATE succeeds as well?

UPDATE:

I managed to find the actual row size – 4,978 bytes. I created a new table containing only this row, and found the row size this way:
[screenshot: query result showing the row size]

And I still don't see anything exceeding the allowed limit.
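(For reference, one way to check the largest physical record size is sys.dm_db_index_physical_stats in DETAILED mode. This is a sketch; the table name dbo.SingleRowCopy is a stand-in for the one-row table described above, not an object from the repro.)

-- Largest leaf-level record size (in bytes) for the one-row table
SELECT ips.alloc_unit_type_desc, ips.max_record_size_in_bytes
FROM sys.dm_db_index_physical_stats(
    DB_ID(), OBJECT_ID(N'dbo.SingleRowCopy'), NULL, NULL, 'DETAILED') AS ips
WHERE ips.index_level = 0;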

UPDATE(2):

Full repro

I made an effort to ensure that the repro does not require any additional auxiliary objects, and that the data is (somewhat) obfuscated.

I tried this on several servers running version 2012, and on one running 2008, and was able to fully reproduce the error on all of them.

Best Answer

Why did the second attempt to merge the same row, which had already been inserted, result in an error? If this row exceeded the maximum row size, I would expect it not to have been possible to insert it in the first place.

First, thank you for the reproduction script.

The problem is not that SQL Server cannot insert or update a particular user-visible row. As you noted, a row that has already been inserted into a table certainly cannot be fundamentally too large for SQL Server to handle.

The problem occurs because the SQL Server MERGE implementation adds computed information (as extra columns) during intermediate steps in the execution plan. This extra information is needed for technical reasons: to keep track of whether each row should result in an insert, update, or delete; and also because of the way SQL Server generically avoids transient key violations during changes to indexes.

The SQL Server Storage Engine requires indexes to be unique (internally, including any hidden uniquifier) at all times - as each row is processed - rather than only at the start and end of the complete transaction. In more complex MERGE scenarios, this requires a Split (converting an update to a separate delete and insert), a Sort, and an optional Collapse (turning adjacent deletes and inserts on the same key back into an update). More information.
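To see the per-row uniqueness requirement in isolation, consider this minimal sketch (a hypothetical table, not part of the repro). Shifting every clustered key up by one would transiently collide with existing keys if processed naively row by row, so the plan typically handles the change via the Split-Sort-Collapse pattern:

-- Hypothetical illustration of why Split-Sort-Collapse exists
CREATE TABLE #T (k integer PRIMARY KEY CLUSTERED, v integer);
INSERT #T (k, v) VALUES (1, 10), (2, 20), (3, 30);

-- Processed naively row by row, updating k = 1 to k = 2 would
-- transiently collide with the existing k = 2, violating uniqueness.
-- The engine instead splits the updates into deletes and inserts,
-- sorts them into a safe order, and collapses matching pairs back.
UPDATE #T SET k = k + 1;

DROP TABLE #T;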

As an aside, note that the issue does not occur if the target table is a heap (drop the clustered index to see this). I am not recommending this as a fix, just mentioning it to highlight the connection between maintaining index uniqueness at all times (clustered in the present case), and the Split-Sort-Collapse.
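(As a diagnostic sketch only, not a fix - the index name here is hypothetical, since it depends on how the repro defines the target:)

-- Make the target a heap, then re-run the MERGE: the error disappears
DROP INDEX cx_merge_repro_target ON MERGE_REPRO_TARGET;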

In simple MERGE queries, with suitable unique indexes, and a straightforward relationship between source and target rows (typically matching using an ON clause that features all key columns), the query optimizer can simplify much of the generic logic away, resulting in comparatively simple plans that do not require a Split-Sort-Collapse, or Segment-Sequence Project to check that target rows are only touched once.
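For contrast, here is a minimal sketch of a "simple" MERGE that typically qualifies for the simplified plan (the table and column names dbo.Target, dbo.Source, pk, and val are assumptions, not objects from the repro):

-- The ON clause matches the full unique clustered key of the target
MERGE dbo.Target AS t
USING dbo.Source AS s
    ON t.pk = s.pk
WHEN MATCHED THEN
    UPDATE SET t.val = s.val
WHEN NOT MATCHED THEN
    INSERT (pk, val) VALUES (s.pk, s.val);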

In complex MERGE queries, with more opaque logic, the optimizer is usually unable to apply these simplifications, exposing much more of the fundamentally complex logic required for correct processing (product bugs notwithstanding, and there have been plenty).

Your query certainly qualifies as complex. The ON clause does not match the index keys (and I understand why), and the 'source table' is a self-join involving a ranking window function (again, with reasons):

MERGE MERGE_REPRO_TARGET AS targetTable
USING
(
    SELECT * FROM 
    (
        SELECT 
            *, 
            ROW_NUMBER() OVER (
                PARTITION BY ww,id, tenant 
                ORDER BY 
                (
                    SELECT COUNT(1) 
                    FROM MERGE_REPRO_SOURCE AS targetTable
                    WHERE 
                        targetTable.[ibi_bulk_id] = sourceTable.[ibi_bulk_id] 
                        AND targetTable.[ibi_row_id] <> sourceTable.[ibi_row_id] 
                        AND 
                        (
                            (targetTable.[ww] = sourceTable.[ww]) 
                            AND (targetTable.[id] = sourceTable.[id]) 
                            AND (targetTable.[tenant] = sourceTable.[tenant])
                        ) 
                        AND NOT ((targetTable.[sampletime] <= sourceTable.[sampletime]))
                ),
                sourceTable.ibi_row_id DESC
            ) AS idx
        FROM MERGE_REPRO_SOURCE sourceTable 
        WHERE [ibi_bulk_id] in (20150803110418887)
    ) AS bulkData
    WHERE idx = 1
) AS sourceTable 
ON 
    (targetTable.[ww] = sourceTable.[ww]) 
    AND (targetTable.[id] = sourceTable.[id]) 
    AND (targetTable.[tenant] = sourceTable.[tenant])
...

This results in many extra computed columns, primarily associated with the Split and the data needed when an update is converted to a delete/insert pair. These extra columns cause an intermediate row to exceed the allowed 8,060 bytes at an earlier Sort - the one just after a Filter:

[execution plan screenshot: the problem Sort]

Note that the Filter has 1,319 columns (expressions and base columns) in its Output List. Attaching a debugger shows the call stack at the point the fatal exception is raised:

[screenshot: stack trace at the point the exception is raised]

Note in passing that the problem is not at the Spool - the exception there is converted to a warning about the potential for a row to be too large.

Why does the update fail when performed via MERGE, while the insert succeeds, and a direct UPDATE succeeds as well?

A direct update does not have the same internal complexity as the MERGE. It is a fundamentally simpler operation that tends to simplify and optimize better. Removing the WHEN NOT MATCHED clause may also remove enough of the complexity that the error is not generated in some cases. That does not happen with the repro, however.

Ultimately, my advice is to avoid MERGE for larger or more complex tasks. My experience is that separate INSERT/UPDATE/DELETE statements tend to optimize better, are simpler to understand, and often perform better overall than MERGE.
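As a sketch of that pattern (again using the assumed names dbo.Target, dbo.Source, pk, and val, with pk matching the target's unique clustered key):

BEGIN TRANSACTION;

-- Update rows that already exist in the target
UPDATE t
SET    t.val = s.val
FROM   dbo.Target AS t
JOIN   dbo.Source AS s
    ON s.pk = t.pk;

-- Insert rows that do not exist in the target yet
INSERT dbo.Target (pk, val)
SELECT s.pk, s.val
FROM   dbo.Source AS s
WHERE  NOT EXISTS
(
    SELECT 1 FROM dbo.Target AS t WHERE t.pk = s.pk
);

COMMIT TRANSACTION;

Under concurrency you would still need to think about isolation or locking hints for the existence check, but each individual statement keeps a simple, easily optimized plan.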