Sql-server – Memory threshold is reached on SQL Server

memory, process, sql server

We currently monitor our whole infrastructure (servers, databases, network components, etc.). The alert thresholds, usually for saturation, are configured as follows:

90% : Warning
95% : Critical

Once a threshold is reached, an alert is generated and sent by email or through a ticketing solution.
To make this concrete, we have a server with the following configuration:

Server memory : 8 GB
SQL max memory : 7 GB

Once SQL Server reaches its max memory, the OS processes on top of it push total usage past the 90% threshold, and an alert is generated.
Which of the following solutions is the best way to avoid the alert?

Physical memory extension
Modification of the alert threshold from 90% to 95%
Empty the unused SQL cache (if possible)
Decreasing SQL max memory

If you have other solutions, do not hesitate to share them.
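
The arithmetic behind this alert can be sketched as follows. This is only a rough illustration in Python: the 0.5 GB figure for the OS and other processes is an assumed value, not a measurement, and the function names are hypothetical.

```python
# Worked arithmetic for the server described in the question.
# ASSUMED_OS_GB is a made-up figure for illustration, not a measurement.

TOTAL_GB = 8.0        # physical memory (from the question)
SQL_MAX_GB = 7.0      # SQL Server max memory (from the question)
WARNING_PCT = 90.0    # warning threshold (from the question)
ASSUMED_OS_GB = 0.5   # hypothetical OS and other-process usage


def used_percent(sql_gb: float, other_gb: float, total_gb: float) -> float:
    """Percentage of physical memory in use."""
    return (sql_gb + other_gb) / total_gb * 100.0


def max_sql_gb_below(threshold_pct: float, other_gb: float, total_gb: float) -> float:
    """Largest SQL Server allocation that keeps usage at or below the threshold."""
    return total_gb * threshold_pct / 100.0 - other_gb


sql_only = used_percent(SQL_MAX_GB, 0.0, TOTAL_GB)
with_os = used_percent(SQL_MAX_GB, ASSUMED_OS_GB, TOTAL_GB)
headroom = max_sql_gb_below(WARNING_PCT, 0.0, TOTAL_GB) - SQL_MAX_GB
sql_cap = max_sql_gb_below(WARNING_PCT, ASSUMED_OS_GB, TOTAL_GB)

print(f"SQL Server alone:          {sql_only:.2f}% of physical memory")
print(f"With assumed OS overhead:  {with_os:.2f}% of physical memory")
print(f"Headroom before warning:   {headroom:.2f} GB")
print(f"Max memory to stay at 90%: {sql_cap:.2f} GB")
```

With SQL Server alone at 7 GB, usage is already 87.5%, so only 0.2 GB of other usage is needed to cross the 90% warning line. Under the assumed 0.5 GB of overhead, usage is 93.75%, and SQL max memory would have to drop to about 6.7 GB to stay at the threshold. SQL Server can also allocate some memory outside the max memory setting, so real usage may be higher than this estimate.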

Best Answer

Sadly, sometimes we get less than helpful information to deal with throughout our busy days.

Q. What do Quantified Alerts tell us?

That service A is doing or enduring something.
That something measurable is happening on a system.

Q. What will it take to Qualify my alerts?

Understand the difference between causation, correlation, and unrelated events, and how a Quantified event relates to a Qualified event. Merely having an alert about memory is like telling an athlete that he is using all his muscles during a sprint. It means nothing by itself.
Causation: events that directly contribute to an impact of measurable scope. For example, shutting down a service has a direct effect on its users.
Correlation: events that relate to the impact as passive results, precursors, or active consequences. For example, delays on the app server may coincide with consistently high memory and CPU usage and with long I/Os recorded in the error logs. Which one is the actual issue? The high usage may prevent new connections, but the limits on the current service may be related to the I/O. The passive result may be that some users experience downtime while others still have access (though slow).
Unrelated events: issues that may seem like correlation or causation but in reality are outliers with no relevance. For example, a user who has forgotten his email reports that the service is down at the same time that other users are actually affected.

Conclusion

Think about the trends of events. Does high memory usage occur during high load or during off hours? Are bad queries being run at the same time, or are many or large queries running? The first is an issue, and the second and third are expected events on a large server.

At the end of the day, having high Memory usage itself is not necessarily a bad thing. Your server is being used. You need to determine measurable negative events in order to decide if the alert is good or bad.
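
One way to picture qualifying an alert is to pair the memory reading with measurable negative events before deciding what it means. The sketch below is only an illustration in Python: the `Sample` fields, the `qualify` function, and the choice of slow queries and failed connections as negative events are all assumptions, not something a particular monitoring tool provides.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One monitoring sample; every field name here is hypothetical."""
    memory_pct: float        # physical memory in use, percent
    slow_queries: int        # queries over an agreed duration limit
    failed_connections: int  # connection attempts that were refused


def qualify(sample: Sample, warning_pct: float = 90.0) -> str:
    """Classify a sample by pairing the memory reading with measurable impact."""
    high_memory = sample.memory_pct >= warning_pct
    impact = sample.slow_queries > 0 or sample.failed_connections > 0
    if high_memory and impact:
        return "investigate"     # high usage coincides with a negative event
    if high_memory:
        return "expected"        # the server is busy, nothing measurable is wrong
    if impact:
        return "look elsewhere"  # impact without memory pressure
    return "ok"


busy_but_healthy = qualify(Sample(memory_pct=93.75, slow_queries=0, failed_connections=0))
busy_and_hurting = qualify(Sample(memory_pct=93.75, slow_queries=12, failed_connections=3))
print(busy_but_healthy)
print(busy_and_hurting)
```

The point of the design is that the memory percentage alone never produces "investigate"; it only does so together with an event that users can actually feel.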

Related Solutions

Sql-server – SQL Server 2008 R2 “Ghost Memory”

You won't get a true picture of memory usage from Task Manager if the account the service is running under has the lock pages in memory privilege (edit: as per Mark Rasmussen's comment/link). To determine how much memory is being used you can look at:

SQLServer:Memory Manager\Total Server Memory perfmon counter
DMVs

I can't recall whether a single DMV or a combination of DMVs will give you the total memory allocation, but the following will show the bulk of it.

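-- Top 10 memory clerks by single-page allocations.
-- single_pages_kb exists in SQL Server 2008 R2 and earlier; SQL Server 2012 and later expose pages_kb instead.
SELECT TOP(10) [type] AS [Memory Clerk Type], SUM(single_pages_kb) AS [SPA Mem, Kb]
FROM sys.dm_os_memory_clerks
GROUP BY [type]
ORDER BY SUM(single_pages_kb) DESC OPTION (RECOMPILE);

-- Buffer pool usage per database (each buffer page is 8 KB).
SELECT DB_NAME(database_id) AS [Database Name],
       COUNT(*) * 8/1024.0 AS [Cached Size (MB)]
FROM sys.dm_os_buffer_descriptors
WHERE database_id > 4 -- exclude system databases
  AND database_id <> 32767 -- exclude ResourceDB
GROUP BY DB_NAME(database_id)
ORDER BY [Cached Size (MB)] DESC OPTION (RECOMPILE);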

The second is usually the most interesting: buffer pool allocations by database. This is where the lion's share of memory will be used, and it can be useful to understand which of your databases are the biggest consumers.
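
The `COUNT(*) * 8/1024.0` expression in the second query converts a count of 8 KB buffer pages into megabytes. As a quick sanity check, the same conversion can be sketched in Python; the function name and the sample page count are made up for illustration.

```python
PAGE_KB = 8  # each row in sys.dm_os_buffer_descriptors is one 8 KB page


def cached_mb(page_count: int) -> float:
    """Convert a number of cached 8 KB pages into megabytes."""
    return page_count * PAGE_KB / 1024.0


# Hypothetical example: 131072 cached pages for one database.
example_mb = cached_mb(131072)
print(example_mb)  # 1024.0
```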

Sql-server – SQL Server 2005 SP3 Memory Errors when plenty of memory is seemingly available

The AWE mechanism in a 32-bit process can only be used for data pages (the buffer pool). It cannot be used for the procedure cache, query memory grants, execution stacks, the access token cache, CLR, and so on: basically all allocations other than data pages. All these allocations (including code pages) have to fit in the 2 GB of the process address space.

Your only solution worth considering is moving to a 64-bit SQL Server deployment on a 64-bit OS. Everything else is a waste of time.

See Using AWE

The SQL Server buffer pool can fully utilize AWE mapped memory; however, only database pages can be dynamically mapped to and unmapped from SQL Server's virtual address space and take full advantage of memory allocated through AWE. AWE does not directly help supporting additional users, threads, databases, queries, and other objects that permanently reside in the virtual address space.
