Have just upgraded VC to 2.5 and now 4 of my 13 Farms have this error. Haven't upgraded the ESX hosts yet; these are still running 3.0.2. Other than the VC upgrade, nothing has changed. Anyone have any pointers as to why the upgrade has now thrown up this problem when it has never been an issue before?
Thanks for the suggestion, but I have already tried that to no avail!
I can lose the warning simply by checking "Allow VMs to be powered on even if they violate availability constraints", but I don't understand why it's now an issue with VC 2.5 when there was no problem with 2.0.2.
VMware HA enforces failover capacity by using memory and CPU reservations to estimate the number of VMs able to recover from multiple host failures (VMware HA: Overview & Technical Best Practices).
How many hosts are in your cluster, and how many host failures are allowed?
I guess it's possible that VC 2.0.2 overlooked certain conditions. I've been comparing some Farms I still have running on 2.0.2 with some Farms running ESX 3.5.0 on VC 2.5.0, and I see some differences in Resource Allocation, so I think I'm going to get these particular Farms upgraded to 3.5 and see what, if anything, that changes.
I upgraded VC from 2.0.1 to 2.5.0 with 3.0.1 ESX hosts. Both of my HA clusters are giving the same symptoms as above: Insufficient resources to satisfy HA. I did not have any warnings while on 2.0.1. Support simply says I don't have enough resources. Any updates on this?
We have 2 HA clusters:
1. 2 Dell PE2850s with 12 GB and 8 GB of RAM, with 17 VMs.
2. 4 Dell PE6850s with 32 GB of RAM each, with 70+ VMs.
Before the upgrade, our 6850 cluster showed failover capacity for 2 hosts. Now it shows 0. I'm interested to know if this will "correct" itself after upgrading the hosts to 3.5.
I have a cluster of 2 ESX 3.0.2 hosts with VC 2.5 and have no problem.
A capacity of 0 hosts means your cluster is not working properly.
Try to re-enable HA:
First uncheck Enable VMware HA -> OK
Then check Enable VMware HA -> OK
Try restarting the VMware management service from the host console:
service mgmt-vmware restart
The "problem" is that HA admission control got more conservative in VC 2.5 (see my comments in this thread: http://communities.vmware.com/message/822784). If you have VMs with different numbers of CPUs in the same cluster (e.g. some with 1 CPU and some with 2 CPUs), that will make the admission control overly conservative, because it tries to ensure that there are enough unfragmented resources for all VMs in the cluster in the case of a failover.
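To make the effect described above a bit more concrete, here is a toy sketch in Python. It is not VMware's actual admission control algorithm, which isn't spelled out in this thread. It assumes a worst-case "slot" model: one slot is sized for the most demanding VM, and every VM is then assumed to need a full slot. The `VM` class, both functions, and all reservation and host figures are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class VM:
    vcpus: int    # not used in the math; only marks which VMs are multi-vCPU
    cpu_mhz: int  # hypothetical CPU reservation
    mem_mb: int   # hypothetical memory reservation plus overhead


def failover_capacity(host_count, host_cpu_mhz, host_mem_mb, vms):
    """Host failures tolerated under a toy worst-case 'slot' model (assumption)."""
    # Size one slot for the most demanding VM, then treat every VM as one slot.
    slot_cpu = max(vm.cpu_mhz for vm in vms)
    slot_mem = max(vm.mem_mb for vm in vms)
    slots_per_host = min(host_cpu_mhz // slot_cpu, host_mem_mb // slot_mem)

    tolerated = 0
    for failed in range(1, host_count):
        surviving_slots = (host_count - failed) * slots_per_host
        if surviving_slots < len(vms):
            break
        tolerated = failed
    return tolerated


def fits_by_actual_demand(host_count, host_mem_mb, vms, failed=1):
    """Compare summed memory demand with the memory left after failures.

    Ignores per-host fragmentation; it is only a rough sanity check.
    """
    return sum(vm.mem_mb for vm in vms) <= (host_count - failed) * host_mem_mb


# Hypothetical cluster: 4 identical hosts, 20 small VMs plus 2 larger 2-vCPU VMs.
small = [VM(1, 256, 1024) for _ in range(20)]
big = [VM(2, 512, 8192) for _ in range(2)]

uniform = failover_capacity(4, 16000, 32768, small)
mixed = failover_capacity(4, 16000, 32768, small + big)
print(uniform, mixed, fits_by_actual_demand(4, 32768, small + big))
```

With these made-up numbers the script prints `3 0 True`: the uniform cluster tolerates three host failures, adding two large VMs drops the reported capacity to zero, and yet the summed demand still fits comfortably on three hosts. That is the general shape of the complaint in this thread, under the stated assumptions.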
I noticed similar issues after upgrading to VC2.5 and ESX 3.5. Our equipment which had been properly spec'd for future growth suddenly was incapable of supporting HA on existing VM load.
We are running four Dell 6950s, each with 32 GB of memory and quad dual-core Opteron CPUs. From what I can tell, HA is being overly conservative in relation to memory, since I can decrease memory allocations and failover capacity will be happy again. What I don't understand is this: we have 128 GB of pooled RAM. If a host drops, there will be 96 GB in the pool. We currently have 22 VMs, so if you divide up the 96 GB of memory equally, they should all have about 4 GB each. The allocations I have set are nowhere close to that. If I calculate our current memory allocations, we are sitting at about 42 GB in total. I realise ESX will take some memory for itself, but I didn't think it created this much overhead. That's 54 GB of memory that has disappeared into the ether! I assume it is not taking into account any VMs that are powered off?
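For anyone who wants to check the arithmetic in the post above, here is a small Python sketch. The figures come straight from the post; the variable names are just illustrative.

```python
hosts = 4
ram_per_host_gb = 32
vm_count = 22
allocated_gb = 42  # the poster's estimate of current memory allocations

pool_gb = hosts * ram_per_host_gb                  # total pooled RAM
after_failure_gb = pool_gb - ram_per_host_gb       # pool left if one host drops
per_vm_gb = after_failure_gb / vm_count            # equal share per VM
unaccounted_gb = after_failure_gb - allocated_gb   # the "missing" memory

print(pool_gb, after_failure_gb, round(per_vm_gb, 2), unaccounted_gb)
```

This gives 128 GB pooled, 96 GB after one host failure, roughly 4.36 GB per VM, and 54 GB not explained by the allocations, which matches the numbers quoted.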
Do you have VMs in your cluster with different numbers of virtual CPUs? As mentioned above, the HA admission control algorithm is more (OK, excessively) conservative in such cases to try to avoid resource fragmentation. We'll be addressing this problem in a future release.
Yes, there are 2 VMs that are running with 2 CPUs (out of necessity); the rest are all single. It does sound a little like a bug if it requires tens of extra GB of memory to cope with such a situation. By "future release" do you mean a hotfix update, or are we talking about the next major version?
I assume HA will still function correctly even if it shows no host failure tolerance?
It would help if you could move those 2 VMs to a different cluster, but I understand that might not be feasible. I'm not sure exactly which release this will make it into, but it should be before the next major release. And yes, HA will still function correctly even if HA admission control says there are not enough resources. When failovers happen, they only take into account the actual resources required by the VMs and the resources available on the host, not the conservative algorithm of HA admission control.
Is there an ETA on a "fix"?
Onsite, I have just had the good fortune to be able to "redo" most of my hosts. I now have:
- 7 (seven) HP DL385 servers in one cluster (28 total processors, 61 GHz of CPU, and 111 GB of RAM) for 51 virtual machines
- Of the 51, I have 25 which are dual vCPU and 26 single.
- My HA on this cluster is 2 (Awesome!)
- 6 (six) HP DL585G5 servers (3 are quad quad-core and 3 are dual quad-core, for 72 processors; 159 GHz of CPU and 420 GB of RAM in total) for 63 virtual machines
- Of the 63, 30 are single vCPU, 28 are dual vCPU, and 5 are quad vCPU
- My HA on this cluster is 0 (BYTES!!!