I have 3 ESX hosts in a cluster. 1 of them has a problem with the HA agent. I tried "Reconfigure for HA". It configures successfully, but HA gets disabled again a short while later. Ping to and from all 3 machines is OK, by name or IP. The license seems to be OK. I can't figure out what's wrong with it.
All ESX hosts have a local hosts file and also resolve names via the DNS server running on the VC. The VC also has a local hosts file. All hosts files are up to date as well.
I don't know what else to check.
Thanks for the help.
Please try disabling and re-enabling HA for the cluster, then remove the host from the cluster and add it again.
And if that fails, the problem could be deeper:
Ensure that all ESX servers in the cluster have correct entries in /etc/hosts for both the FQDN and the short (NetBIOS) name. All hosts should be able to ping all other hosts by both names.
Ensure your service console has enough memory allocated. We found that our HA agent was crashing and/or a runaway process in a buggy version of 3.0.1 HA was causing the service console to run out of memory. In this case, HA and other services crashed and restarted periodically.
The above are less likely, but entirely possible, level 2 or 3 troubleshooting scenarios. Let me know if you need more detail.
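The hosts-file check described above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample host names are made up, and it is meant to be run against a copy of each host's /etc/hosts on any machine with a current Python, not necessarily on the service console itself.

```python
def parse_hosts(text):
    """Map every hostname or alias found in hosts-file text to its IP."""
    names = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        fields = line.split()
        ip, hostnames = fields[0], fields[1:]
        for name in hostnames:
            # Keep the first IP seen for a name, as a resolver would.
            names.setdefault(name.lower(), ip)
    return names


def missing_entries(text, fqdns):
    """Return the FQDNs and short names from `fqdns` absent from the text."""
    known = parse_hosts(text)
    missing = []
    for fqdn in fqdns:
        short = fqdn.split(".", 1)[0]
        for name in (fqdn, short):
            if name.lower() not in known:
                missing.append(name)
    return missing


if __name__ == "__main__":
    # Hypothetical sample: esx2 lacks its short name.
    sample = (
        "127.0.0.1 localhost\n"
        "10.0.0.1 esx1.example.local esx1\n"
        "10.0.0.2 esx2.example.local\n"
    )
    print(missing_entries(sample, ["esx1.example.local", "esx2.example.local"]))
```

An empty result for every host's file means each ESX server is listed under both names; it does not replace the actual ping test.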
Also check out /etc/ft_hosts; it contains a copy of the /etc/hosts file which isn't always updated. If you can't find anything wrong there, there are a couple of things you can try:
1 remove the host from the cluster and add it again
2 reboot the host
3 up the SC memory to around 800 MB (just to be sure, as jdvcp mentioned)
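To see how much memory the service console currently has before raising it, something like the following could be used. It is a rough sketch, assuming the text comes from /proc/meminfo (or a copy of it) and that a `MemTotal:` line in kB is present; the function names are hypothetical, and the reported total is normally a little below the configured value because the kernel reserves some memory.

```python
def mem_total_mb(meminfo_text):
    """Return MemTotal from /proc/meminfo-style text in whole MB, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kb = int(line.split()[1])  # the value is reported in kB
            return kb // 1024
    return None


def below_target(meminfo_text, target_mb=800):
    """True if the reported total is under the target (800 MB as suggested)."""
    total = mem_total_mb(meminfo_text)
    return total is not None and total < target_mb


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        text = f.read()
    print(mem_total_mb(text), below_target(text))
```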
Found the problem and why the stupid hostd kept dying.
I had made a backup copy of "/etc/vmware/firewall/services.xml" in the same directory, so hostd / the firewall kept trying to load both copies.
Removed the backup copy and restarted "S98mgmt-vmware".
HA is now working again, as are VI Client control and access, etc.
Thanks to everyone who gave suggestions.
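For anyone hitting the same thing, a quick way to spot leftover copies in that directory is sketched below. It assumes services.xml is the only file expected in /etc/vmware/firewall, which was true in this case but may not hold on every install; the function name and the `expected` parameter are just for illustration.

```python
import os


def stray_files(directory, expected=("services.xml",)):
    """Return sorted names of regular files in `directory` not in `expected`."""
    return sorted(
        name
        for name in os.listdir(directory)
        if name not in expected
        and os.path.isfile(os.path.join(directory, name))
    )


if __name__ == "__main__":
    # Directory taken from the post above; anything listed here is a
    # candidate for moving elsewhere, not for blind deletion.
    print(stray_files("/etc/vmware/firewall"))
```

Backups are safer kept outside the directory, so nothing else tries to load them.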