SQL Server – How to calculate required resources from a baseline, and what should be measured in the first place

monitoring, sql-server, sql-server-2014, sql-server-2008

I have to deal with the following situation:

Currently we have

  • 1 cluster with 5 nodes running 10 instances (SQL Server 2008 Enterprise edition), and
  • 1 cluster with 2 nodes running 5 instances (SQL Server 2014 Standard edition)

in use. All servers are virtual machines running on VMware.

We want to set up a new cluster (SQL Server 2014 Enterprise). In a first step, 1 instance from the 2008 Enterprise cluster and 1 instance from the 2014 Standard cluster are to be migrated.

Therefore, my boss asked 2 questions:

  1. How many cores do we need (aim: minimize license costs)?
  2. How much RAM do we need?

My answer was: "It depends …" Now I have to deliver hard facts by monitoring over the next few weeks. Great! (irony intended)

My approach for question number 1:

Using perfmon.exe, I plan to monitor

  • Processor\% Processor Time (_Total and individual cores),
  • Processor\% User Time (_Total and individual cores),
  • Processor\% Interrupt Time (_Total and individual cores) – is this really necessary? – and
  • System\Processor Queue Length.

The question is: where should I get this data from? From the node? From the SQL Server instance?

In the first case it should be easy: the first instance in question – vsql2008ent-1\instanceX, for the sake of simplicity – currently runs on a node, let's call it sql2008NodeA. Under normal conditions no other instances or servers run on this node, so it should not matter where I take the data from, should it? In case of a disaster other instances will run on this node too, but we want a baseline for normal operation.

The second instance – vsql2014stan-1\instanceY – shares its node – sql2014NodeA – with 2 other instances. In this case I can never be sure how many cores the instance would truly need for smooth operation, right? I can monitor the instance, but what does the result mean? It only shows the CPU resources this instance actually used. Would more cores have been used if they had been available? So what would be the answer to question 1 above?
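One idea I'm toying with (just a sketch, and I don't know yet whether it answers the "would more cores have been used" part): every instance keeps roughly the last few hours of CPU samples in the scheduler monitor ring buffer, split into "this instance" vs. "everything else on the node". That would at least show how much of the node's CPU instanceY itself consumed compared to its neighbours:

    -- Rough sketch: recent CPU utilization as seen from inside one instance,
    -- taken from the scheduler monitor ring buffer (about one sample per minute).
    -- ProcessUtilization = this instance's share, SystemIdle = idle time;
    -- the remainder is everything else running on the node.
    DECLARE @ts_now bigint =
        (SELECT cpu_ticks / (cpu_ticks / ms_ticks) FROM sys.dm_os_sys_info);

    SELECT TOP (240)
           DATEADD(ms, -1 * (@ts_now - [timestamp]), GETDATE()) AS sample_time,
           record.value('(./Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'int') AS sql_cpu_pct,
           record.value('(./Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int')         AS idle_pct,
           100
             - record.value('(./Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int')
             - record.value('(./Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'int') AS other_cpu_pct
    FROM (
        SELECT [timestamp], CONVERT(xml, record) AS record
        FROM sys.dm_os_ring_buffers
        WHERE ring_buffer_type = N'RING_BUFFER_SCHEDULER_MONITOR'
          AND record LIKE N'%<SystemHealth>%'
    ) AS rb
    ORDER BY sample_time DESC;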

RAM is the other question. Because of several disasters in the past, when all instances landed on the same node, I have set an upper limit for max server memory on each instance. This limit depends on the available memory of the node (currently 100 GB or 120 GB, respectively). So how do I monitor this? If all memory is used up, everything seems clear: insufficient memory. If everything is slow: insufficient memory. But how much memory do I really need?
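As a starting point I could probably watch a couple of instance-level memory counters alongside perfmon; a minimal sketch (on a named instance the object_name prefix is MSSQL$<InstanceName>: instead of SQLServer:, hence the LIKE patterns):

    -- Rough sketch: page life expectancy plus target vs. total server memory.
    -- A PLE that keeps dropping during the busy period, or a target that stays
    -- pinned at the max server memory cap, would hint that the cap is too tight.
    SELECT [object_name], counter_name, cntr_value
    FROM sys.dm_os_performance_counters
    WHERE (counter_name = N'Page life expectancy'
           AND [object_name] LIKE N'%Buffer Manager%')
       OR (counter_name IN (N'Target Server Memory (KB)', N'Total Server Memory (KB)')
           AND [object_name] LIKE N'%Memory Manager%');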

I try to summarize my questions:

  1. Where should I take the measurements from (node vs. instance)?
  2. Do I need to monitor interrupt time if I want to know the number of cores required?
  3. What should I monitor, under the given circumstances, to determine how much RAM I need (I know: "the more, the better")?

Thank you very much for your help!

Best regards!

Best Answer

The question of whether and how to directly measure CPU core usage is beyond my expertise, but here is what I would consider trying:

Run a standard Profiler trace, with the database name column added, during your normally busiest period. Total up the CPU column of the SQL:BatchCompleted and RPC:Completed events by database and you'll get a rough idea of how much CPU (possibly spread across multiple cores) each database is consuming. (Perhaps also total up the CPU column of the other events to see whether anything major was missed, and save the trace "as trace table" for analysis.)
Exactly how to translate that into a number of cores, I can't say. But if you also measure total system CPU usage during the Profiler run, you can estimate each database's share of the total.
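Once the trace has been saved, totalling up CPU per database is a simple aggregate; a minimal sketch, assuming the trace was written to C:\traces\baseline.trc (a placeholder path) and includes the DatabaseName column:

    -- Total CPU (ms) per database for RPC:Completed (event 10) and
    -- SQL:BatchCompleted (event 12) from a saved trace file.
    SELECT DatabaseName,
           SUM(CPU)  AS total_cpu_ms,
           COUNT(*)  AS batches
    FROM sys.fn_trace_gettable(N'C:\traces\baseline.trc', DEFAULT)
    WHERE EventClass IN (10, 12)
    GROUP BY DatabaseName
    ORDER BY total_cpu_ms DESC;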

Note: if your server handles fewer than a few hundred batch requests per second (see the SSMS Activity Monitor), a standard Profiler trace, even run across the network, will almost certainly not affect performance. If you instead script a server-side trace, it can handle more requests per second without slowing anything down, but I make no promises for your environment.
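For what it's worth, a minimal server-side trace along those lines could look like the sketch below; the file path and the eight-hour stop time are only placeholders for your environment:

    -- Minimal server-side trace: CPU and DatabaseName for RPC:Completed (10)
    -- and SQL:BatchCompleted (12), rolling over at 100 MB per file.
    DECLARE @traceid int, @maxfilesize bigint = 100, @on bit = 1;
    DECLARE @stoptime datetime = DATEADD(HOUR, 8, GETDATE());

    EXEC sp_trace_create @traceid OUTPUT, 2, N'C:\traces\baseline', @maxfilesize, @stoptime;

    -- Columns: 13 = Duration, 14 = StartTime, 18 = CPU, 35 = DatabaseName
    EXEC sp_trace_setevent @traceid, 10, 13, @on;
    EXEC sp_trace_setevent @traceid, 10, 14, @on;
    EXEC sp_trace_setevent @traceid, 10, 18, @on;
    EXEC sp_trace_setevent @traceid, 10, 35, @on;
    EXEC sp_trace_setevent @traceid, 12, 13, @on;
    EXEC sp_trace_setevent @traceid, 12, 14, @on;
    EXEC sp_trace_setevent @traceid, 12, 18, @on;
    EXEC sp_trace_setevent @traceid, 12, 35, @on;

    EXEC sp_trace_setstatus @traceid, 1;   -- start the trace
    SELECT @traceid AS traceid;            -- keep this to stop (status 0) and close (status 2) it later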

For RAM, I wonder whether http://www.sqlshack.com/sql-server-memory-performance-metrics-part-4-buffer-cache-hit-ratio-page-life-expectancy/ might help you determine whether your instances need less or more. I don't think there's a clean way to do this per database, though.
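One thing that might come close: grouping sys.dm_os_buffer_descriptors by database_id shows what each database currently holds in the buffer pool (not what it would actually need, and it can be slow on instances with a lot of memory):

    -- Rough sketch: current buffer pool usage per database, in MB (8 KB pages).
    SELECT CASE WHEN database_id = 32767 THEN N'ResourceDb'
                ELSE DB_NAME(database_id) END AS database_name,
           COUNT(*) * 8 / 1024               AS buffer_pool_mb
    FROM sys.dm_os_buffer_descriptors
    GROUP BY database_id
    ORDER BY buffer_pool_mb DESC;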