VMware Cloud Community
PKaufmann
Enthusiast

HA - Display of Failover Capacity

Hi,

I have a strange problem with two of my ESX clusters. I have three ESX clusters: no. 1 with 3 hosts, no. 2 with 4 hosts, and no. 3 with 2 hosts.

When I click on ESX cluster no. 1, the summary on the right side shows:

Current Failover Capacity: 2 hosts

Configured Failover Capacity: 1 host

When I click on the other ESX clusters, they show: Current Failover Capacity: 0 hosts, Configured Failover Capacity: 1 host.

Even the cluster with 4 hosts shows a current failover capacity of 0?!

That's strange, and I don't know how to correct it. Do you have any ideas?

thnx...

Philipp

10 Replies
kumarkv
Enthusiast

Hi

The current failover capacity shows the number of host failures your cluster can support. You can check the failover capacity in the spreadsheet. HA capacity is calculated based on the VM that has the highest memory and on the memory in each ESX host.
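Roughly, the idea behind the calculation is something like this (just a simplified Python sketch of the slot math as I understand it, with made-up numbers, not VMware's exact formula):

# Simplified sketch of the HA failover-capacity idea (not VMware's exact formula).
# The VM with the most memory defines the "slot" size; each host provides
# floor(host memory / slot) slots, and the cluster still needs one slot per
# powered-on VM after the failed hosts are removed.
def failover_capacity(host_mem_gb, vm_mem_gb):
    slot = max(vm_mem_gb)                          # largest VM drives the slot size
    slots_per_host = [int(h // slot) for h in host_mem_gb]
    needed = len(vm_mem_gb)                        # one slot per powered-on VM
    remaining = sum(slots_per_host)
    capacity = 0
    # assume the biggest hosts fail first (worst case)
    for host_slots in sorted(slots_per_host, reverse=True):
        if remaining - host_slots < needed:
            break
        remaining -= host_slots
        capacity += 1
    return capacity

print(failover_capacity([16, 16, 16], [2, 2, 1, 1]))   # plenty of headroom -> 2

If the largest VM is big compared to the host memory, the slot count per host collapses and the reported capacity drops to 0 very quickly.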

Try reconfiguring the memory for your virtual machine, bounce the VC service, and see if the value changes. That should give you a good idea.

regards

Kumar


Cheers, Kumar KV. If you find this helpful, don't forget to award points.
Rajeev_S
Expert

Hi Philipp,

Sounds strange. Did you try reconfiguring VMware HA on the clusters? I would try that first.

Hope this helps

Rajeev

PKaufmann
Enthusiast

Yes, I have reconfigured HA and also DRS several times, without any change 😕

A few weeks ago everything was fine. We have not created many new virtual machines since then (I think there are 4 new VMs, but they are not using many resources).

PKaufmann
Enthusiast

What do you mean by "bounce the VC service"? Restarting the VirtualCenter service?

I used the Excel sheet and the result is "HA Failover Capacity: -8" ^^

I think the problem is that there is 1 VM with 6 GB RAM and 1 with 4 GB RAM.

All other VMs have between 256 MB and 2 GB of RAM.

4 hosts with 16 GB memory each, 30 VMs (3 turned off).

A few weeks ago (as I wrote in the post above) everything was all right. The VMs with the large amounts of RAM were also running at that time. Hmmm.
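If I do the rough slot math by hand for this 4-host cluster (just my own back-of-the-envelope check, assuming the slot follows the largest VM at 6 GB and ignoring service console memory and virtualisation overhead, so it will not match the sheet's exact -8):

# back-of-the-envelope check for the 4-host cluster (assumptions: slot = 6 GB,
# 16 GB per host, 27 powered-on VMs; all overheads ignored)
slot_gb = 6
slots_per_host = 16 // slot_gb            # = 2 slots per host
total_slots = 4 * slots_per_host          # = 8 slots in the whole cluster
powered_on_vms = 27
print(total_slots, powered_on_vms)        # 8 slots for 27 VMs -> no host may fail

So with the 6 GB VM driving the slot size I can see how the capacity ends up at 0, though I still do not see what changed compared to a few weeks ago.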

At the moment, the host memory usage is:

Host1: 68%

Host2: 55%

Host3: 57%

Host4: 51%

On the cluster where everything looks good (current failover capacity = 2 hosts), it looks like this:

Host1: 55%

Host2: 48%

Host3: 56%

In this cluster there are also 2 VMs with 6 GB RAM and 2 with 4 GB RAM running. In total there are 21 VMs running in this cluster.

When I enter this data into the Excel sheet, it shows -5 ^^ Hmmmm.

best regards,

Philipp

stormin2b1
Contributor

Check your ESX hosts file located at /etc/hosts

Confirm that your host names are correct and that the VirtualCenter server is also listed.

Example:

# Do not remove the following line, or various programs

# that require network functionality will fail.

127.0.0.1 localhost.localdomain localhost

10.50.7.11 esx1.domain.com esx1

10.50.7.12 esx2.domian.com esx2

10.50.7.13 esx3.domain.com esx3

10.50.7.16 esx4.domain.com esx4

10.50.7.17 esx5.domain.com esx5

10.50.7.18 virtualcntr.domain.com wk-virtualcntr

0.vmware.pool.ntp.org

1.vmware.pool.ntp.org

2.vmware.pool.ntp.org

Then restart the management agent from the CLI on each host: as root, type service mgmt-vmware restart

Then enable HA and post an update if that does not work.

PKaufmann
Enthusiast

Hi, my /etc/hosts file looks like this on every ESX host:

I manually added the entry for the VirtualCenter server (the last one...)

# Do not remove the following line, or various programs

# that require network functionality will fail.

127.0.0.1 localhost.localdomain localhost

172.16.2.201 rz2esx001.blue-net.lan rz2esx001

172.16.2.234 rz2esx002.blue-net.lan rz2esx002

172.16.2.248 rz2esx003.blue-net.lan rz2esx003

172.16.2.246 rz2esx004.blue-net.lan rz2esx004

172.16.2.244 rz2esx005.blue-net.lan rz2esx005

172.16.2.235 rz2esx006.blue-net.lan rz2esx006

172.16.2.237 rz2esx007.blue-net.lan rz2esx007

172.16.2.252 rz2esx008.blue-net.lan rz2esx008

172.16.2.250 rz2esx009.blue-net.lan rz2esx009

172.16.2.15 rz2vcs001.blue-net.lan rz2vcs001

I disabled the HA function, then restarted the agent (service mgmt-vmware restart), then enabled HA again through VirtualCenter.

Result: while enabling the HA function, it showed current failover capacity = -1 ^^. Then it switched back to 0. After finishing the HA configuration, it still shows "current failover capacity = 0" 😞

Another strange thing is that on the working cluster with failover capacity = 1, there is no VirtualCenter entry in the hosts file.

Any ideas?

best regards,

Philipp

RobBuxton
Enthusiast

Philipp,

Did you resolve this? We're seeing the same problem after upgrading to VC 2.5. We have 6 hosts. All was working fine before the upgrade.

I've restarted most things I can think of, several of them more than once, but it hasn't worked so far.

cheers,

Rob.

ac57846
Hot Shot

Hi Rob,

Hope all is well at the council.

In ESX 3.5 the way HA capacity is determined has changed: HA now assumes a reservation of 256 MB and 256 MHz for any VM that has no reservation set.

Consequently the smallest HA "slot" is around 300 MB of RAM (256 MB plus virtualisation overhead). This will often be much larger than it effectively was before the upgrade, which shrinks the reported failover capacity.

Mike Laverick blogged about it and links to the VMware site with the advanced settings to change these assumed reservations.

That should allow you to get back to the situation you had before the upgrade.
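Roughly, the effect looks like this (a directional sketch only, with invented numbers; if I remember the setting names right, the das.vm* advanced options involved are das.vmMemoryMinMB and das.vmCpuMinMHz):

# Directional sketch of the ESX 3.5 slot-size behaviour (not the exact algorithm;
# the 50 MB overhead figure is just an illustrative guess)
def memory_slot_mb(largest_reservation_mb, assumed_min_mb=256, overhead_mb=50):
    # VMs with no reservation are counted at the assumed minimum reservation
    return max(largest_reservation_mb, assumed_min_mb) + overhead_mb

print(memory_slot_mb(0))                        # no reservations, default 256 -> 306 MB slot
print(memory_slot_mb(0, assumed_min_mb=150))    # lowered assumed reservation -> 200 MB slot
# A smaller slot means more slots per host, so the computed failover capacity goes up.

The sketch only shows the memory side; the CPU side works the same way with the assumed 256 MHz.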

You may also want to check my blog about another issue that may arise after upgrading DRS clusters.

Al.

Alastair Cooke

RobBuxton
Enthusiast

Al,

All is indeed well at the Council, and thanks to your response it's even better, as that was the issue.

I knew something must have changed, and it did seem like a reservation issue, except we've never used reservations. I reduced the setting down to 150 and the failover capacity immediately increased.

Many thanks for the references.

cheers,

Rob.

cxo
Contributor

A little late to this thread, but we too were seeing the same issue as Rob. The calculator (spreadsheet) alluded to earlier in this thread produced negative numbers for both our "good" cluster and our "bad" cluster. Changing the das.vm* values in the advanced HA settings (to 192 MB and 192 MHz) made my non-compliant cluster happy once again (that cluster had a 9 GB VM).
