I'm guessing the 'memory left' output you posted is from Christian Bolton's VAS usage analysis script. Is that correct? If so, you have only 17-18 MB available in the memtoleave
area, which is likely to cause a problem at some point and could cause all or most of the error messages that you pasted.
This query from Jonathan Kehayias here will give you an idea of what is using memory in the memtoleave
area:
SELECT type, virtual_memory_committed_kb, multi_pages_kb
FROM sys.dm_os_memory_clerks
WHERE virtual_memory_committed_kb > 0 OR multi_pages_kb > 0
The post also explains memtoleave
well.
Whether the size of the caches you asked about is a problem depends on a number of things. Since this is a 32-bit instance, one of the more relevant things to consider is whether the allocations for those caches are allocated using the single-page allocator, which uses the buffer pool, or the multi-page allocator, which prior to SQL Server 2012 doesn't use the buffer pool and instead consumes memory in the memtoleave
area. You have about 13 GB allocated to the buffer pool, so a few hundred MB allocated to this or that isn't necessarily going to cause a problem. However, a few hundred MB in the memtoleave
area is enough to cause some of the errors you listed.
Kehayias' post explains these concepts as well as your options for solving the problem if it is indeed memtoleave
exhaustion.
I would focus mainly on what is starving the memtoleave
area, as well as what overall memory availability in Windows looks like, along with general measures of memory availability in SQL Server such as page life expectancy.
I would also consider whether the problem helps make a business case for migrating to a 64-bit version of SQL Server, as this avoids memtoleave
issues unless the whole server is simply running out of memory. Understanding and managing SQL Server's memory usage is significantly simpler on 64-bit builds, especially on SQL Server 2012 where all allocations go through the buffer pool. You may also want to consider applying SP4 as there are a number of known bugs in SP3.
Best Answer
Update: One of your comments raised a really good point that I sadly didn't even think of first. This is actually a pretty common place to look at - the higher your MAXDOP - the more threads you can use. The more you can use, the more memory that can be consumed because you have more going on at once inside of a query - more parallel thread. You said your MAXDOP is 10 server wide. This "feels" high. The latest guidance is 8 if more than 8 logical cpus, or something between "0 and N" if you have less than 8. I tend to look at the number of cores per NUMA node and may go lower than 8 for some systems that are running OLTP workloads based on the number of cores per NUMA node, and sometimes I go higher. But you might look at these guidelines and consider looking at your instance wide MAXDOP and consider changing it and testing to see the impact. Going lower than 10 will reduce your memory grants for these queries.
This doesn't eliminate the need to tun queries, however, and I think there is probably room for that. Basically you need to either continually increase memory until you can satisfy all your grants, consider lowering MAXDOP, tune queries so they consume smaller memory grants, consider reducing the workload - lower the amount of queries executed at once, or get into Resource Governor settings which can artificially lower the grant amount but most assuredly increase TempDB dependence which the memory grants (memory grant % especially).
Those are the main approaches I take. If you can, starting with query tuning and perhaps considering MAXDOP changes are a good place to start. I tend to leave Resource Governor as a final resort, have good luck with tuning queries and playing with server wide or query specific DOP.
Original Answer Query memory grants are a part of SQL Server helping to ensure the right amount of memory is available before running a query. It is a way to try and make sure that you aren't running an excessive amount of high memory consuming queries that will all but guarantee a lot of memory consuming operations spilling to TempDB.
In this case I imagine your Top 3 queries are consuming a lot of memory. And you were right to look at their plans. Hashes and Sorts are probably one of the issues here. Looking at tuning queries so they are appropriate in their plans and, in turn, memory consumption is a great solution here. I doubt this is a bug - it is the way SQL Server was designed.
This article is a bit old, but the knowledge contained within still applies. It is a good place to begin understanding query memory grants.
If tuning the queries doesn't save the day here, there are other tricks that can be employed. I've also asked some questions in the comments section of your question that may help expand an answer from me or someone else.