Does anyone know why HA seems to need to be reconfigured so often? In my DRS cluster, some of the ESX servers show an alarm saying the HA agent is not responding.
When I check the status on the ESX servers, some show that the primary agent is running and some show the secondary. What is primary vs. secondary, and what condition would cause this?
Any help is appreciated.
I would look at your management network to make sure there are no network issues. As for the primary and secondary nodes: primary nodes are the master nodes. They maintain the catalog of all running VMs and the hosts they are on, and they are the nodes that initiate cluster failover. Secondary nodes are just members of the cluster.
Make sure DNS is ok between the ESX hosts.
Delete the file FT_HOSTS from the host that is having issues (it contains the ESX hosts' service console addresses used for HA communication).
Restart hostd: service mgmt-vmware restart
Re-configure the host for HA
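
For the DNS check suggested above, here is a minimal Python sketch that tries a forward lookup for each host name and reports the ones that fail. It is only an illustration, not a VMware tool: the function names check_forward_dns and unresolved are made up for this example, and you would need to run it somewhere that uses the same DNS servers as your ESX hosts.

```python
import socket


def check_forward_dns(hostnames, resolve=socket.gethostbyname):
    """Return {hostname: ip_or_None}; None means the lookup failed.

    'resolve' defaults to the standard library resolver and can be
    swapped out (for example in a test) with any callable that takes
    a name and returns an address or raises OSError.
    """
    results = {}
    for name in hostnames:
        try:
            results[name] = resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[name] = None
    return results


def unresolved(results):
    """Return the sorted list of host names whose lookup failed."""
    return sorted(name for name, ip in results.items() if ip is None)
```

To use it, pass in the names of the hosts in your cluster (for example a hypothetical list such as ["esx01.example.local", "esx02.example.local"]) and print unresolved(check_forward_dns(hosts)). An empty list means every name resolved. It does not check reverse lookups or whether the hosts can actually reach each other.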