>Based on this, and the lack of the alarm, we can assume that if I brought down one of the hosts in this 2-host cluster, all the VMs would be able to start on the single remaining host, correct?
Correct. If one of your hosts crashed or was reset, all your VMs would have enough resources to start (no guarantee of performance, just starting and running).

>And the alarm I refer to in this thread's title will only get triggered when that 98% goes to 50% or below, correct?
There is no alarm. When your CPU or memory failover capacity reaches 50%, admission control will prevent you from powering on another VM.

>I am still concerned about that resource distribution chart for Memory, though. It looks like I could push it up to 100% on both hosts, and still not push memory failover capacity lower than 50%. I presume this is due to the fact that my VMs don't have reserved memory, so vSphere only counts the minimal amount of memory required to start the VM, and it will depend on swapping and ballooning if all the VMs actually start using all their memory.
Correct.

>So the alarm I'm looking for is one that replicates the resource distribution chart, and alerts when the total unutilized memory in the cluster is less than the total memory of a single host. Make sense?
I'm not aware of an alarm like this. You may be able to find something through Google or create something yourself.

What about creating reservations for your VMs, or at least for the important ones? That way you can make sure they'll get the resources when they restart. Or, if you really wanted, you could change admission control to the dedicated failover host policy. (With what I know of your situation I don't think I'd recommend this, but it would give you what you are asking for.)

Have you looked at the vRAS fling?

VM Resource and Availability Service – VMware Labs
This Fling enables you to perform a what-if analysis for host failures on your infrastructure. You can simulate failure of one or more hosts from a cluster (in vSphere) and identify how many:
- VMs would be safely restarted on different hosts
- VMs would fail to be restarted on different hosts
- VMs would experience performance degradation after being restarted on a different host

With this information, you can better plan the placement and configuration of your infrastructure to reduce downtime of your VMs/Services in case of host failures.
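
If you do end up creating something yourself, the check you describe is simple arithmetic once you have per-host memory numbers. Below is a rough sketch in plain Python, not a vSphere alarm definition. It assumes you have already collected total and consumed memory for each host by whatever means you prefer; the `Host` class, the function names, and the sample numbers are all hypothetical. It also shows why the admission control percentage can stay high while the hosts look full: that figure is derived from reserved memory, not from consumed memory.

```python
from dataclasses import dataclass


@dataclass
class Host:
    # Hypothetical container; populate it from your own inventory data.
    name: str
    total_mem_mb: int  # physical memory installed in the host
    used_mem_mb: int   # memory currently consumed on the host


def unused_memory_mb(hosts):
    """Sum of memory not currently consumed across all hosts."""
    return sum(h.total_mem_mb - h.used_mem_mb for h in hosts)


def can_absorb_host_failure(hosts):
    """True if unused cluster memory is at least the size of one host.

    Assumption: with unequal hosts we compare against the largest one,
    which is the worst case for a single host failure.
    """
    largest = max(h.total_mem_mb for h in hosts)
    return unused_memory_mb(hosts) >= largest


def reserved_failover_capacity_pct(total_mb, reserved_mb):
    """Failover capacity as admission control sees it (simplified).

    Only reserved memory is subtracted, so VMs without reservations
    barely move this number however much memory they consume.
    """
    return 100.0 * (total_mb - reserved_mb) / total_mb


if __name__ == "__main__":
    # Made-up numbers for a 2-host cluster, purely for illustration.
    cluster = [
        Host("host-a", total_mem_mb=131072, used_mem_mb=90000),
        Host("host-b", total_mem_mb=131072, used_mem_mb=85000),
    ]
    if not can_absorb_host_failure(cluster):
        print("ALERT: unused memory is less than one host's memory")
    print(reserved_failover_capacity_pct(total_mb=262144, reserved_mb=5000))
```

You could run something like this on a schedule and send yourself an email when the check fails. The trade-off is that it is based on consumed memory at one point in time, so it is more conservative than admission control and can fire when the VMs would in fact still restart.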