SQL Server 2016 – Query Crashes Server

memorysql server

SQL Server 2016 continually crashes when running an ETL source query. Memory starts consuming at a fantastic rate then hits max and the service shuts off. This seems ridiculous, especially since the query runs perfectly on SQL Server 2014 with less memory and slower disks. <What could have changed and more importantly, is there a way to change server query or memory settings in 2016 that will prevent this crash scenario? Can anyone narrow the issue (and confirm it's a bug) or provide a more robust work around?

MAXDOP is set to 4, min memory is 8000 max is 24000 with 28gb on the system. The SQL 2014 machine had SharePoint running on it also (different SQL instance), and as a result has a lower max memory (12533mb). The SQL 2016 machine is an exact copy except it does not have the SharePoint install and was upgraded to SSDs.

The query only has 9 columns (6 INTs) for 7,122,737 records with two small table joins and minimal filtering… there's nothing that can change with the query and again it was working perfectly on SQL 2014. In fact, it runs a minute faster in SQL 2014 vs SQL 2016 despite the SSDs and almost twice the memory additions to a supposedly faster release!

enter image description here

So after a lot of searching, multiple different installs reverting to a database only install in default locations I have mostly identified the issue. One of the columns being hashed is a VARCHAR(MAX) which triggers a massive memory request from HASHBYTES. Not entirely unreasonable, even though the longest column is actually only 49 characters. Source system… work around will be to cast it dynamically I think.

If SQL Server reaches max memory, which it quickly arrives at with this query it crashes. This seems hit and miss, and may generate other errors instead, leaving the server up. Adding DISTINCT ensures the crash behavior unless excessive amounts of memory are present.

I've created a Microsoft Connect issue here if you have additional testing or ideas or can reproduce the issue. I'd say this is only partially the answer as it is still merely narrowed to a range of possibilities, that I don't have the time or capability to fully identify. However, it does not appear to be general memory pressure or a result of general SQL 2016 setup and services.

The following can reproduce the issue.

CREATE TABLE Testing (
    VarCharMAX VARCHAR(MAX) NULL,
    VarCharRealistic VARCHAR(100) NULL
) ON PRIMARY
GO

SELECT TOP 1000000
    IDENTITY(INT, 1, 1) AS Number
INTO Numbers
FROM master.dbo.syscolumns x
CROSS JOIN master.dbo.syscolumns
GO --Ensure 1 million rows were inserted

INSERT INTO Testing
        (VarCharMAX, VarCharRealistic)
SELECT
     CAST(Number As VARCHAR(MAX)) + 'a' AS VarCharMAX
    ,CAST(Number As VARCHAR(100)) + 'a' AS VarCharRealistic
FROM Numbers
GO

--Server crash under ~16GB memory
SELECT DISTINCT
    HASHBYTES('md5', VarCharMAX)
FROM dbo.Testing

Best Answer

I think I might have a legit answer .. and its a bug.

(similar was found when running checkdb in 2014 & fixed in SP1 + CU1)

When I run the query as is, SQL Server crashes

2016-06-16 17:27:32.00 spid37s Failed allocate pages: FAIL_PAGE_ALLOCATION 1

And in the error log I see :

MEMORYCLERK_SQLQUERYEXEC (node 0)                KB
---------------------------------------- ----------
VM Reserved                                       0
VM Committed                                      0
Locked Pages Allocated                            0
SM Reserved                                       0
SM Committed                                      0
Pages Allocated                            10678248  ==> 10GB !!
2016-06-16 17:27:32.00 spid37s     

With OPTION (MAXDOP 1) => serial execution

MEMORYCLERK_SQLQUERYEXEC (node 0)        KB
---------------------------------------- -----------
VM Reserved                              0
VM Committed                             0
Locked Pages Allocated                   0
SM Reserved                              0
SM Committed                             0
Pages Allocated                          144

Please test it in your environment and let me know.