VMware Cloud Community
mrchrisp
Contributor

Maximum Recommended Memory Allocation for a VM before Disabling DRS

Hi there,

At what amount of allocated memory would you recommend changing a VM's DRS automation level from Fully Automated to Partially Automated, Manual or Disabled? This is on ESXi and vCenter 4.0 U2.

We have an Exchange 2010 host with 61GB of memory allocated, and if this machine is vMotioned it will grind to a halt and take hours to be migrated. We therefore set DRS to Disabled for this VM. I'm wondering if we should actually disable DRS for any VM over a certain allocated memory size as a standard. I understand that the rate of change also plays a part here, but that is difficult to gauge until the machine has been running for a time, at which point this issue may have already cropped up. Moving to 4.1 or 5.x is not an option at the current time.

Thanks

Chris

4 Replies
Sreejesh_D
Virtuoso

Without DRS, the recommended memory allocation for the VMs would be:

Host total memory >= VM1 + VM2 + VM3

i.e., the total memory of the VMs should not exceed the host RAM.
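
The rule above can be sketched as a quick check. This is only a minimal illustration: the function name and the host and VM sizes are made up, and it ignores hypervisor and per-VM memory overhead.

```python
def fits_without_overcommit(host_ram_gb, vm_allocations_gb):
    """Return True if the summed VM allocations do not exceed host RAM.

    Assumption: overhead for the hypervisor and for each VM is ignored.
    """
    return sum(vm_allocations_gb) <= host_ram_gb


# Hypothetical host with 96 GB of RAM and a 61 GB Exchange VM
print(fits_without_overcommit(96, [61, 16, 8]))       # 85 GB allocated
print(fits_without_overcommit(96, [61, 16, 16, 8]))   # 101 GB allocated
```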

frankdenneman
Expert

Chris,

Interesting question! We do not have memory-size "thresholds" as guidelines for when to select an automation level.

Selecting Disabled is the best automation level if you are not comfortable with DRS migrating virtual machines. I see you have already chosen that mode over the others. For more information about the impact of the automation level on DRS, I would recommend taking a look at: http://frankdenneman.nl/2012/07/27/considerations-when-modifying-the-individual-vm-automation-level/

I have some questions regarding the behavior of the virtual machine and DRS:

Have you ever seen DRS generate a migration recommendation for this virtual machine? Are the other virtual machines in the cluster of similar size, or are they very different in size?

DRS should weigh the Cost, Benefit and Risk of migrating this virtual machine. I expect DRS to pick other virtual machines when this virtual machine is surrounded by smaller virtual machines.

Have you monitored the active memory state of your Exchange virtual machine? DRS takes active memory into account when selecting a virtual machine for migration. A virtual machine can have a high consumed memory state (due to caching) but a low active memory state.

Blogging: frankdenneman.nl Twitter: @frankdenneman Co-author: vSphere 4.1 HA and DRS technical Deepdive, vSphere 5x Clustering Deepdive series
mrchrisp
Contributor

Thanks for the info, Frank. In answer to your questions:

  1. I have seen DRS attempt to migrate this VM once, and heard of it being migrated once previously. In hindsight, although I can't recall exactly, these migrations could have been when the host the VM was running on was placed in maintenance mode. Or the cluster could have been under pressure due to a lack of memory, which meant someone had set DRS to Aggressive.
  2. The majority of the cluster (around 80%) is made up of VMs under 16GB of allocated memory. So I'd presume there are plenty of smaller VMs for DRS to move first.
  3. Looking at the active memory state of this VM, it sits at around 1/3 of the consumed memory, but at times I'd expect it to be higher.

So I presume DRS shouldn't generate migration recommendations for this machine, but if forced to it will, for instance when the host the VM is running on is placed in maintenance mode. I guess my question is slightly modified then: how would you recommend assessing the VMs on a host for those that should be cold migrated prior to the host being placed in maintenance mode? Is there a calculation I could run based on the active memory/page changes to see how long a vMotion would take?

We hope to take advantage of multiple vMotion NICs in 5.1, so this problem should eventually go away and we can safely leave DRS Fully Automated for all VMs.

Chris

frankdenneman
Expert

DRS takes the active memory of the virtual machine into account when determining which virtual machine to migrate. It will also look at the stable time, meaning the time a virtual machine was using the same amount of memory. If a virtual machine's memory usage fluctuates heavily, it is seen as a less optimal candidate, as DRS cannot predict the effect of a migration on memory consumption on the destination host, which in turn can affect the existing virtual machines.

Therefore, determining which virtual machine will be moved by DRS is a challenge. It depends on consumption, stable time over the last 60 minutes, and constraints such as affinity rules, and all of that for both CPU and memory. Depending on which resource is at a premium within the cluster (whichever is used more, CPU or memory), DRS decides which metric to weigh more heavily.

When migrating virtual machines for maintenance mode, DRS considers these migrations mandatory. Mandatory moves need to be satisfied before manual moves, and are always executed. DRS will take the overall load balance of the cluster and the individual host utilization into account. It tries to find the most suitable location for this virtual machine.

It's very difficult to calculate the cost of a vMotion. I recently published an article about the metrics involved: http://frankdenneman.nl/2012/12/04/calculating-the-bandwidth-usage-and-duration-of-a-vmotion-process...
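
As a rough illustration of why a large, busy virtual machine can take so long to move, here is a minimal sketch of a generic iterative pre-copy model. It is not VMware's actual vMotion algorithm and not the method from the article linked above. The function name, the parameters, the switchover threshold and the example numbers are all assumptions for illustration, and real dirty rates are rarely constant.

```python
def estimate_precopy_seconds(memory_gb, bandwidth_gb_per_s, dirty_rate_gb_per_s,
                             switchover_threshold_gb=0.1, max_rounds=100):
    """Estimate copy time for a simplified iterative pre-copy migration.

    Each round copies the memory left over from the previous round while
    the guest keeps dirtying pages at a constant rate (an assumption).
    Returns None when the guest dirties memory at least as fast as the
    network can copy it, because this model then never converges.
    """
    if dirty_rate_gb_per_s >= bandwidth_gb_per_s:
        return None

    remaining_gb = memory_gb
    total_seconds = 0.0
    for _ in range(max_rounds):
        round_seconds = remaining_gb / bandwidth_gb_per_s
        total_seconds += round_seconds
        # Memory dirtied during this round must be copied in the next one.
        remaining_gb = dirty_rate_gb_per_s * round_seconds
        if remaining_gb <= switchover_threshold_gb:
            break
    # Final copy of the last dirty pages before switchover.
    return total_seconds + remaining_gb / bandwidth_gb_per_s


# Hypothetical numbers: 61 GB VM, about 0.1 GB/s usable on a single 1 GbE link
idle = estimate_precopy_seconds(61, 0.1, 0.0)
busy = estimate_precopy_seconds(61, 0.1, 0.05)
print(f"idle guest: {idle / 60:.1f} min, busy guest: {busy / 60:.1f} min")
```

In this model the total time approaches memory / (bandwidth - dirty rate), so doubling the usable bandwidth shortens the migration by more than half when the dirty rate is significant.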

Multi-NIC vMotion configurations are highly recommended, as vMotion will load balance both single and multiple vMotion operations across the uplinks, reducing migration times. This in turn can give DRS the ability to load-balance the cluster in a smaller number of invocations.

I'm currently publishing a series of articles on Multi-NIC vMotion network design, which might be interesting for you as well: http://frankdenneman.nl/2012/12/18/designing-your-vmotion-network/
