Sql-server – Failed allocate pages: FAIL_PAGE_ALLOCATION 1

memorysql serversql-server-2012

We are on SQL Server 2012 SP3 Enterprise Edition on Windows 2012 R2.

I saw these errors in the sql logs:

Failed allocate pages: FAIL_PAGE_ALLOCATION 1

2016-06-14 04:28:27.44 spid175 Error: 701, Severity: 17, State:
123.

2016-06-14 04:28:27.44 spid175 There is insufficient system memory
in resource pool 'default' to run this query.

2016-06-14 04:28:27.44 Server Error: 17300, Severity: 16, State:
1. (Params:). The error is printed in terse mode because there was error during formatting. Tracing, ETW, notifications etc are skipped.

2016-06-14 04:28:27.44 Server Error: 17300, Severity: 16, State:
1. (Params:). The error is printed in terse mode because there was error during formatting. Tracing, ETW, notifications etc are skipped.

2016-06-14 04:28:27.44 spid131 Error: 701, Severity: 17, State:
123.

As far as I can tell, it looks like it ran out of memory at some point.
Is there a way to figure out what caused it to go out of memory?
The MEMORYCLERK_SQLQERESERVATIONS is pretty high, does anyone know what that is for?

MEMORYCLERK_SQLQERESERVATIONS (node 0)           KB
---------------------------------------- ----------
VM Reserved                                       0
VM Committed                                      0
Locked Pages Allocated                            0
SM Reserved                                       0
SM Committed                                      0
Pages Allocated                            22599824

EDIT: We have 32 gigs of RAM on the server and 28 is allocated to the SQL Server. The max memory setting is 28gb and min server memory is 8gb.

Here is a link to the ErrorLog output:
ErrorLog

Here is the link to the sys.dm_os_process_memory output:
Query Output

Link to the wait types: WaitTypes

I ran it during a time when we seem to have more memory usage:

SELECT * FROM sys.dm_exec_query_memory_grants where grant_time is null

Results: memory_grants

SELECT mg.granted_memory_kb, mg.session_id, t.text, qp.query_plan
FROM sys.dm_exec_query_memory_grants AS mg
CROSS APPLY sys.dm_exec_sql_text(mg.sql_handle) AS t
CROSS APPLY sys.dm_exec_query_plan(mg.plan_handle) AS qp
ORDER BY 1 DESC OPTION (MAXDOP 1)
Results: QueryMemGrants

SELECT top 50 t.text, cp.objtype ,qp.query_plan, cp.usecounts, cp.size_in_bytes >as [Bytes Used in Cache]
FROM sys.dm_exec_cached_plans AS cp
JOIN sys.dm_exec_query_stats AS qs ON cp.plan_handle = qs.plan_handle
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS t
WHERE qp.query_plan.exist('declare namespace >n="http://schemas.microsoft.com/sqlserver/2004/07/showplan"; >//n:MemoryFractions') = 1
order by cp.size_in_bytes desc
OPTION (MAXDOP 1)

Results: CachedMemGrants

Select granted_query_memory,session_id,command from sys.dm_exec_requests
Results:dm_exec_requests

XML of query plan from one of the queries that was running:
QueryPlan

Best Answer

The output of errorlog had dbcc memorystatus dump and what I noticed was

Process/System Counts                         Value(in Bytes)
---------------------------------------- ----------
Available Physical Memory                1217605632---1.1 G
Available Virtual Memory                 140627167866880
Available Paging File                    5656502272
Working Set                               305238016
Percent of Committed Memory in WS                99
Page Faults                                27923310
System physical memory high                       0
System physical memory low                        0
Process physical memory low                       1--Memory Low
Process virtual memory low                        0
2016-06-14 04:28:27.41 Server    

Please note the available physical memory is very low. There was almost no memory in buffer pool

Regarding clerk which is consuming more memory

MEMORYCLERK_SQLQERESERVATIONS (node 0)           KB
---------------------------------------- ----------
VM Reserved                                       0
VM Committed                                      0
Locked Pages Allocated                            0
SM Reserved                                       0
SM Committed                                      0
Pages Allocated                            22599824  --21.5 G

Page Life Expectancy                             64

Now on server where max server memory is 28 G if MEMORYCLERK_SQLQERESERVATIONS is taking 21.5 G that is definitely a problem. This is what causing the OOM condition.

What is MEMORYCLERK_SQLQERESERVATIONS

This is a memory clerk in SQL Server which tracks memory allocated to query which involves Sort or hash operations during execution. These operators can be the largest memory consumers for a query.

Why OOM error due to this

When query involving sort and hash operations is executed it will make a reservation request based on the original query plan which contained a sort or a hash operator. Then as the query executes, it requests the memory and SQL Server will grant that request partially or fully depending on memory availability. There is a memory clerk (accountant) named ‘MEMORYCLERK_SQLQERESERVATIONS’ which tracks memory allocation to such requests . Now in your scenario following could be happening

  1. Query is requesting so much memory grant that SQL Server is only able to provide it a limited amount, this limited amount is called "Required Memory", so that it starts executing and while executing the query, because memory requirement was large and SQL Server cannot provide it as there was no memory in resource pool, the query fails with OOM error. The memory required when query is running is called "Additional Memory"

  2. There was Bug fixed in SQL Server 2012 Sp1 CU4 where query requested huge amount of memory grant causing it to be drastically slow or subsequently failing with OOM error. The possibility that bug resurfaced cannot be ruled out considering fact that QEReservations hogged all of the buffer pool

  3. Since the clerk has already taken 90 % of memory. Required Memory for new query is not available and query fails with OOM error.

  4. Your tables and indexes has skewed statistics which is forcing optimizer to build sub optimal plan causing it to request much more memory grant than actually required and in turn creating issues.

  5. Lastly the queries running on SQL Server requires some serious tuning.


As per This Blogs.msdn article

What Can a Developer Actually Do about Sort/Hash Operations?

Speaking of re-writing queries, here are some things to look for in a query that may lead to large memory grants.

Reasons why a query would use a SORT operator (not all inclusive list):

ORDER BY (T-SQL)

GROUP BY (T-SQL)

DISTINCT (T-SQL)

Merge Join operator selected by the optimizer and one of the inputs of the Merge join has to be sorted because a clustered index is

not available on that column.

Reasons why a query would use a Hash Match operator (not all inclusive list):

JOIN (T-SQL) – if SQL ends up performing a Hash Join. Typically, lack of good indexes may lead to the most expensive of join operators

– Hash Join. Look at query plan.

DISTINCT (T-SQL) – a Hash Aggregate could be used to perform the distinct. Look at query plan.

SUM/AVG/MAX/MIN (T-SQL)– any aggregate operation could potentially be performed as a Hash Aggregate . Look at query plan.

UNION – a Hash Aggregate could be used to remove the duplicates.

To further understand the problem I would require you to add output of below queries into your question. I would also like you to add output of Paul Randal Wait stats query. The source of query is This Blog, I suggest you to read the blog.

SELECT * FROM sys.dm_exec_query_memory_grants where grant_time is null

--Find who uses the most query memory grant:

SELECT mg.granted_memory_kb, mg.session_id, t.text, qp.query_plan
FROM sys.dm_exec_query_memory_grants AS mg
CROSS APPLY sys.dm_exec_sql_text(mg.sql_handle) AS t
CROSS APPLY sys.dm_exec_query_plan(mg.plan_handle) AS qp
ORDER BY 1 DESC OPTION (MAXDOP 1)

--Search cache for queries with memory grants:

SELECT t.text, cp.objtype,qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
JOIN sys.dm_exec_query_stats AS qs ON cp.plan_handle = qs.plan_handle
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS t
WHERE qp.query_plan.exist(‘declare namespace n=”http://schemas.microsoft.com/sqlserver/2004/07/showplan“; //n:MemoryFractions’) = 1

There are few other things I would like you to check for queries running on system.

Select granted_query_memory,session_id,command from sys.dm_exec_requests

This will show you how much memory is granted to queries running on the system.

If you can see XML actual execution plan you have MemoryGrant=xxxxx can you collect this value for costly queries.

All the above will show us if there is problem in query or some other issue as to why it is requesting so much memory for execution.


EDIT

From various query outputs you pasted.

enter image description here

You can see the requested_memory_kb for large number of queries are approx 5G, this is large memory grant, ideally it should be few MB's. Do note that required_memory_kb is just around 5 MB and granted_query_memory is NULL this is because due to memory pressure SQL Server is just able to provide minimum memory to start the query but not able to provide additional memory for query execution resulting query to fail with OOM error.

The query costs for queries requesting huge memory is also high which leads me to believe that either statistics are skewed or queries are written poorly. Other possibility would be query not supported by proper index. Number of queries requesting such a huge memory grant is good in number.

enter image description here

For above queries see granted_query_memory it is all in GB. The first 3 queries running used approx 15 G of memory which almost used 50 % of memory. In SQL Server millions of process run which require memory in some way so you can see if 3 queries are using 50% of available memory OOM issue is bound to occur.

Solution

You should seriously consider tuning the first 4 queries in above screenshot

Make sure you run index rebuild and stats update at least weekly so that skewed stats does not force optimizer to produce bad plan.

Use resource governor and create a resource pool and workload group and run queries which are requesting large memory grant in this pool. You can limit the memory request with parameter request_max_memory_grant_percentage. An example is shown in this Blog. This is just alternate method till you tune all your queries.