Does anyone know how this scenario would be handled?
I have 3 ESX hosts each with 500GB of RAM
Host A has 1 VM (VM_A) with 450GB of RAM
Host B has various VMs using a total of 200GB
Host C has various VMs using a total of 200GB
If Host A fails, VM_A would need to fail over to either Host B or Host C, but neither has enough free memory (each has only 300GB free).
Would vMotion:
A. start VM_A on Host B and after a period of time migrate the existing VMs to Host C, since Host B is now overloaded?
or
B. migrate VMs away from Host B to Host C until there are enough resources for VM_A, then start VM_A on Host B?
I know you can choose that VM_A does not power on until there are sufficient resources, but is vMotion smart enough to make room for it before it is powered on?
I don't want to get into a situation where it starts on a host that is overloaded, even for a short while, or where it doesn't power on at all and requires manual intervention.
Hi
First of all, please read the resources recommended by Scott.
Second, you need to decide whether you are asking about host evacuation (really a DRS event) or host failure (an HA event).
Third, there are a lot of options that affect the outcome.
A few remarks:
Consider the traditional scenario with no admission control and no reservations.
When HA kicks in after Host A fails, the big VM will be powered on at the first available host (B or C), and then DRS will start to move small VMs to the other host.
Now consider the non-traditional scenario with reservations.
HA won't be able to power on the big VM at the first attempt, but it will notify DRS about that.
DRS in turn will attempt to make enough room by migrating VMs (for example from B to C).
HA checks regularly and makes several attempts to power on the VMs from failed hosts.
AFAIR the last HA attempt is made about 30 minutes after the failure, to give DPM time to power on hosts that are in standby.
Sooner or later one of your hosts will have enough room to fit the big VM, and HA will power it on.
If you use reservations, consider enabling admission control based on a percentage of cluster resources.
It will prevent you from shooting yourself in the foot.
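To make the reservation scenario concrete, here is a rough Python sketch of the arithmetic using the numbers from the original question. It is not how DRS actually picks migrations. The 4 x 50GB split of the existing VMs, the assumption that all of that memory is reserved, and the helper names (free_gb, make_room) are all made up for illustration.

```python
HOST_CAPACITY_GB = 500
BIG_VM_GB = 450  # VM_A from the question

# Reserved memory per VM on each surviving host.
# ASSUMPTION: the question only says "200GB total"; the 4 x 50GB split is invented.
hosts = {
    "B": [50, 50, 50, 50],
    "C": [50, 50, 50, 50],
}


def free_gb(vms):
    """Unreserved memory left on a host running the given VMs."""
    return HOST_CAPACITY_GB - sum(vms)


def make_room(hosts, target, need_gb):
    """Move VMs off `target` to other hosts until `need_gb` fits there.

    Returns a list of (vm_gb, source, destination) moves, or None if
    the target cannot be freed up enough. Simple greedy sketch only.
    """
    layout = {name: list(vms) for name, vms in hosts.items()}
    moves = []
    # Try the largest VMs first so fewer migrations are needed.
    for vm in sorted(layout[target], reverse=True):
        if free_gb(layout[target]) >= need_gb:
            break
        dest = next(
            (h for h in layout if h != target and free_gb(layout[h]) >= vm),
            None,
        )
        if dest is None:
            continue  # this VM fits nowhere else, try the next one
        layout[target].remove(vm)
        layout[dest].append(vm)
        moves.append((vm, target, dest))
    return moves if free_gb(layout[target]) >= need_gb else None


print("Free on B before any moves:", free_gb(hosts["B"]), "GB")
print("Moves needed to fit VM_A on B:", make_room(hosts, "B", BIG_VM_GB))
```

With these assumed numbers, neither host can take VM_A straight away (300GB free each), but moving three of the 50GB VMs from B to C leaves 450GB free on B, which is the kind of room DRS would need to make before HA's retry succeeds.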
Your scenario isn't specific to vMotion at all; you're actually asking about HA (failover) and DRS (compute resource management).
vMotion is merely the live migration mechanism that DRS uses for its dynamic balancing.
This may help: Using vSphere HA and DRS Together
As the article mentions, the priority is availability, so that's HA.
Unless your VMs are set with memory reservations, there will be "room" for your big VM to fail over. It will just contend with the other VMs on whichever host it fails over onto.
It will then be down to DRS to balance the VMs across the 2 remaining hosts (using vMotion as necessary).
Assuming we use memory reservations, I'm wondering if we would run into a situation where it simply fails because 'not enough resources available' as opposed to moving things around to allow enough resources.
I guess the real answer is "test it and see"
With admission control enabled, HA would have prevented you from powering on all your VMs in the first place (even with all hosts available) if their failover could not be guaranteed.
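As a rough illustration of that check, here is a small Python sketch assuming the percentage-based admission control policy mentioned earlier in the thread, looking at memory reservations only. The function names and the 33% and 50% settings are assumptions for the example; the real calculation also covers CPU and per-VM overhead, so treat this as back-of-the-envelope arithmetic rather than what HA literally runs.

```python
def current_failover_capacity(total_gb, reserved_gb):
    """Fraction of cluster memory not claimed by reservations."""
    return (total_gb - reserved_gb) / total_gb


def admits(total_gb, reserved_gb, new_vm_gb, configured_pct):
    """Would powering on one more VM keep the configured failover reserve?

    configured_pct is the share of cluster resources set aside for
    failover (hypothetical setting, e.g. 0.33 for one host out of three).
    """
    remaining = current_failover_capacity(total_gb, reserved_gb + new_vm_gb)
    return remaining >= configured_pct


TOTAL_GB = 3 * 500      # three hosts with 500GB each
SMALL_VMS_GB = 200 + 200  # VMs on Host B and Host C, assumed fully reserved
VM_A_GB = 450

print("Admitted with 33% reserved:", admits(TOTAL_GB, SMALL_VMS_GB, VM_A_GB, 0.33))
print("Admitted with 50% reserved:", admits(TOTAL_GB, SMALL_VMS_GB, VM_A_GB, 0.50))
```

Note that this is an aggregate check across the cluster. As far as I know it does not by itself guarantee that a single host has 450GB free after a failure, which is why DRS may still have to move VMs around before the big VM can restart.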
OK, thanks, this is what I was looking for!
"Now consider the non-traditional scenario with reservations.
HA won't be able to power on the big VM at the first attempt, but it will notify DRS about that.
DRS in turn will attempt to make enough room by migrating VMs (for example from B to C).
HA checks regularly and makes several attempts to power on the VMs from failed hosts.
AFAIR the last HA attempt is made about 30 minutes after the failure, to give DPM time to power on hosts that are in standby.
Sooner or later one of your hosts will have enough room to fit the big VM, and HA will power it on."