VMware Cloud Community
admin
Immortal

HA VM restart question on resource overcommitment

I think I know the answer, but I want to put this out there because I have seen and contributed to a lot of posts on HA operation, admission policy, and failover capacity calculations. I have a pretty good grasp of these concepts. I understand how slot sizes are computed, how reservations are set, and how a primary HA node restarts VMs on hosts with sufficient capacity. HA should continue to restart VMs on a host until its capacity is exhausted and then move on to a new host.

My question is what constitutes exhausted capacity for a host. We know a host can overcommit RAM and CPU using various techniques. I can manually power on VMs in a cluster without any reservations and cause a performance problem, but they will start. Will HA stop powering on VMs on a specific host when its AVAILABLE RAM and CPU are exhausted or less than the slot size? I believe the answer is yes, but I want to put it out there.

Thanks

5 Replies
kjb007
Immortal

When you speak about slot size, the slot size is calculated for the cluster. And you will have x number of slots available in your cluster. Once you have exhausted that number, your admission control policy will prevent any more virtual machines from powering on.

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB
admin
Immortal

I understand that, but my question is: once the slot size is calculated, how are the host's available resources calculated? Does it use only available resources, or does it factor in things like overcommitting RAM and CPU?

kjb007
Immortal

That's just it, the host isn't really calculating resources, per se. HA creates a slot size based on reservations and/or the default values, and figures out how many VMs can fit in your cluster. Once it does that, it only has to figure out how many VMs are running and how many more can fit before it reaches its magic number. So, if you're not using reservations, you can change the default slot calculation values and HA will add "slots" accordingly. It is not doing an available resource calculation, per se.
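
A minimal Python sketch of that counting logic, memory only. The function names, the 256 MB default argument, and the rule that the largest reservation sets the slot size are assumptions for illustration, not HA's actual code or API:

```python
def slot_size_mb(reservations_mb, default_mb=256):
    """Slot size in MB: the largest reservation, or the default if none is set.

    Assumption: the largest reservation in the cluster drives the slot size.
    """
    largest = max(reservations_mb, default=0)
    return largest if largest > 0 else default_mb


def can_power_on(running_vms, cluster_memory_mb, reservations_mb, default_mb=256):
    """Admit one more VM only while a free slot remains.

    Note that actual memory usage never enters the decision, only the
    slot count and the number of VMs already running.
    """
    total_slots = cluster_memory_mb // slot_size_mb(reservations_mb, default_mb)
    return running_vms < total_slots
```

With a 12 GB cluster and a single 1 GB reservation, `can_power_on(11, 12 * 1024, [1024])` is true and `can_power_on(12, 12 * 1024, [1024])` is false, regardless of how much RAM the running VMs are really using.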

-KjB

admin
Immortal

OK. The host is not calculating the resources, but when will HA determine that resources are exhausted during a restart of VMs? I am not talking about admission control. I am talking about an actual failure event handled by HA, and at what point HA will decide to stop powering on VMs on the current host and move on to another host. Does it ONLY consider available resources on the ESX host, or can it consider overcommitting RAM and CPU? Are you saying the slot size calculation determines whether resources can be overcommitted, or whether HA should only use resources that are actually available? When I say available, I am referring to the concept of resource reservations on a VM, where the VM will not power on unless the resources specified in the reservation are actually available.

kjb007
Immortal

Ok, here's an example.

3 hosts, each with 2 x 3 GHz CPUs and 4 GB of RAM.

Total cluster = 18 GHz, 12 GB of RAM

If you have one VM with a reservation of 2 GB of RAM, then you will have 6 slots available in your cluster, and with a reservation of 1 GB, you will have 12 slots. Once you have 12 VMs running, the 13th will not power on. In this case, it will not matter that, out of those 12 GB, only 1.5 GB of RAM is in use.

Second, without any reservation, you use the default values, ~256 MB, so you will have about 46 slots, and the 47th VM will not power on.

So, using that, if one of your hosts dies, the total left over will be 12 GHz and 8 GB of RAM. Using the same scenarios, you now have 4 slots (2 GB reservation), 8 slots (1 GB reservation), or, with no reservations, about 31 slots. The 32nd VM that tries to power on in the last scenario will fail.
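
The arithmetic above can be checked with a short Python sketch. It is a simplification that looks at memory only and ignores CPU and any per-VM overhead, and the helper name `slots` is made up for this example. Plain division gives 48 and 32 slots for the 256 MB case, slightly above the "about 46" and "about 31" quoted here, presumably because the real figures allow for overhead that this sketch does not model:

```python
GB = 1024  # MB per GB


def slots(hosts, memory_per_host_mb, slot_mb):
    """Number of whole slots that fit in the cluster's total memory."""
    return (hosts * memory_per_host_mb) // slot_mb


scenarios = [
    ("2 GB reservation", 2 * GB),
    ("1 GB reservation", 1 * GB),
    ("256 MB default", 256),
]

for label, slot_mb in scenarios:
    before = slots(3, 4 * GB, slot_mb)  # all three hosts up
    after = slots(2, 4 * GB, slot_mb)   # one host has failed
    print(f"{label}: {before} slots with 3 hosts, {after} with 2 hosts")
```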

I hope that makes sense.

-KjB

Last, you can modify that 256 MB value as well. By modifying it, you change the slot size and, in turn, the number of slots; a smaller value yields more slots, so you can create a scenario where you will be overcommitting resources.
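
As a rough illustration of how the slot size setting relates to overcommitment, here is a Python sketch. It assumes a hypothetical cluster filled with VMs that are each configured with 512 MB of RAM and have no reservation; the function name and the 512 MB figure are invented for the example:

```python
def overcommit_ratio(cluster_memory_mb, slot_mb, configured_mb_per_vm):
    """Configured memory of a cluster with every slot filled, divided by physical memory."""
    total_slots = cluster_memory_mb // slot_mb
    return total_slots * configured_mb_per_vm / cluster_memory_mb


for slot_mb in (512, 256, 128):
    ratio = overcommit_ratio(12 * 1024, slot_mb, configured_mb_per_vm=512)
    print(f"slot size {slot_mb} MB -> {ratio:.1f}x configured vs physical memory")
```

On the 12 GB cluster from the example, a 512 MB slot fills physical memory exactly, while 256 MB and 128 MB slots let twice and four times as much configured memory be admitted.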

Message was edited by: kjb007 : Added last comment.
