VMware Cloud Community
rameshsangepu
Contributor
Contributor

HA gives "RED" mark very often

Hi

HA gives "RED" mark very often in esx 3.0.2 update1 cluster and issue will resolve when I reconfigure from VC. Any known issues and fixes for this behaviour since there are no issues either with network or ESX servers.

Reply
0 Kudos
7 Replies
rushto_jp
Contributor
Contributor

Ramesh, I guess you are talking about alarms getting triggered..from the alarm description, check whether its cpu or memory which is constrained and take it from there.

Reply
0 Kudos
kjb007
Immortal
Immortal

I had that issue previously. HA agent was known to restart every now and then, and sometimes not show correctly in vc. Check to make sure you registered the host with vc using FQDN and not IP. Also, it may be time to update your hosts, as that issue was resolved in later releases.

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB
Reply
0 Kudos
rameshsangepu
Contributor
Contributor

Its alarm only but not related to either CPU or memory. When I click 'reconfigure HA' it tries to reconfigure and it fails with "HA agent has been disabled". When I look at vpxa.log and /opt/LGTOaam512 logs says

ftcli -domain vmware -connect <servername> -port 8042 blah..blah...blah...blah

User Not Found

Access Denied

And also if any one has more details about below process please post....thanks

ftAgent

ftStateMon

ftProcmon

ftbb

ftRuleManager

Reply
0 Kudos
mescobarhms01
Contributor
Contributor

If you click on the cluster, and select the tab for summary, at the top there should be a yellowish box that says why it is alarming. Also, if it fails to reconfigure for HA on an individual host, in your recent tasks, click on the link and it should take you to a more fulfilling error message.

Reply
0 Kudos
kjb007
Immortal
Immortal

I would disable HA for the cluster, wait for everything to settle, and then re-enable. This shouldn't require any downtime. If that does not work, I would remove that server from the cluster, and then add it back in.

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB
Reply
0 Kudos
rameshsangepu
Contributor
Contributor

Yes the resolution is to remove HA cluster and enable which will reinstall HA package. This issue happens to 4-5 nodes of total 8 node cluster, though config is same for all nodes. Instead of remove and enable HA if we know why its happening then we can fix once for all.

Reply
0 Kudos
kjb007
Immortal
Immortal

It may be a problem with the release you are using. I haven't seen this issue in some time. I would suggest updating to a newere release of 3.0.2, if you can't yet go to 3.5.

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB
Reply
0 Kudos