I have the same problem, and it is very strange. I tried various methods, but it only works if I:
1) re-read the license file
2) remove HA
3) re-enable HA
--> OK... but,
4) reconfigure HA --> fails again!
I had the same issue after upgrading VC from U2 to U3. On my 2-node cluster, I first tried disabling HA and re-enabling it, but that failed. I then removed both HA and DRS, then enabled only HA, and it worked. I then switched DRS back on and it stayed OK.
I was getting this error after rebooting the host. Just entering Maintenance Mode and then exiting Maintenance Mode fixed it for me. It may have been because the host would not enter Maintenance Mode prior to the reboot, as it was acting flaky (which was the reason for the reboot).
Followed your advice of maintenance mode, remove from VC, re-add host directly to cluster. Worked for me! Thanks!
I just wanted to add my 2 cents: we were seeing the same symptom as part of some other HA problems on 2 hosts in a 5-host cluster. For each host we did: Maintenance Mode => Remove from VC => Reboot Host => Re-add directly into cluster => Exit Maintenance Mode.
This was the final step in resolving our HA issues and it resolved the issue as described by the original poster as well.
Removing HA and adding it back worked for me. Removing node and adding it back did not.
Had 2 perfectly working ESXi 3.5 servers in a cluster. Rebooted the first host for some minor changes, and HA worked fine when I re-enabled it.
Tried the same thing on the second host and got the same error as everyone else in this post.
Tried everything mentioned above and none worked until I tried this:
Disable DRS and HA for the cluster.
Enable DRS first and then enable HA, in that order. It is working fine again.
Hi, it just happened to me too. I needed to reboot before re-adding to VC.
I'm on ESXi 3.5.0 build 130755.
I tried your trick as well. Unfortunately, it is still occurring. What's interesting is that after all that (putting the hosts in Maintenance Mode, removing them, recreating the cluster, re-adding them), I can take them out of Maintenance Mode individually and the service starts!!! I can even alternate and it works, but with both of them on, one or the other (depending on which one started first) gets the error. Gotta love those elusive errors.
I have an EVC cluster with ESX 3.5 Update 3 servers in it, without HA enabled. I enabled HA and got the "cmd addnode failed..." error on 2 of the 6 hosts. After reading this thread, I disabled HA on the cluster and enabled HA again. This worked fine and was very easy to do. I would suggest trying this as one of your first troubleshooting steps.
I had this same error and it turned out to be a licensing issue. I had to disable HA, re-start the license server and then enable HA.
I had the exact same error. It turned out that I had mistyped the hostname during the host build and had to rename the host. It took me a while to figure out because the DNS name was correct, and we only caught it by a fluke.
To rename an ESX server, you only need to find and rename the hostname in the relevant files.
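For what it's worth, on a classic ESX 3.x service console the hostname usually lives in /etc/hosts and /etc/sysconfig/network (those paths are my assumption; the poster didn't name the files). A minimal sketch of the rename, run here against sample copies in /tmp so it's safe to try before touching the real files:

```shell
# Hypothetical old/new names for illustration
OLD=esx01typo
NEW=esx01

# Build sample copies of the two files (on a real host: /etc/hosts
# and /etc/sysconfig/network)
mkdir -p /tmp/rename-demo
printf '192.168.1.10 %s.BTEC %s\n' "$OLD" "$OLD" > /tmp/rename-demo/hosts
printf 'HOSTNAME=%s.BTEC\n' "$OLD" > /tmp/rename-demo/network

# Replace every occurrence of the old name in place
for f in /tmp/rename-demo/hosts /tmp/rename-demo/network; do
  sed -i "s/$OLD/$NEW/g" "$f"
done

grep -h "$NEW" /tmp/rename-demo/hosts /tmp/rename-demo/network
```

After editing the real files you would also set the running hostname (`hostname` command) and, as the posts here note, the name must match what DNS returns.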
The steps I used were this (and it worked):
I had this error on 1 of the 3 systems in a particular cluster
On each HOST in the cluster, Reconfigure for HA (do the system(s) having the error last). When all systems had been reconfigured, they were fine.
Hope this helps.
Check the hosts file on every ESX host for the right configuration. See below: remove the line containing "::1" and add entries for the rest of your ESX hosts.
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
192.168.1.202 MAX2.BTEC MAX2
192.168.1.200 MAX1.BTEC MAX1
Then choose "Reconfigure for VMware HA" from the host menu on every host in the failing cluster.
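This check can be scripted. A minimal sketch, assuming a bash service console: the file is written to /tmp as a sample here (on a real host you would point it at /etc/hosts), and MAX1/MAX2 are the example names from this post.

```shell
# Sample hosts file matching the example above
cat > /tmp/hosts.sample <<'EOF'
127.0.0.1 localhost.localdomain localhost
192.168.1.202 MAX2.BTEC MAX2
192.168.1.200 MAX1.BTEC MAX1
EOF

# check_hosts FILE HOST...  -> flags a "::1" line and any missing host entry
check_hosts() {
  local file=$1; shift
  local bad=0
  # the IPv6 "::1" loopback line is what this post says breaks HA
  if grep -q '::1' "$file"; then
    echo "found ::1 line - remove it"
    bad=1
  fi
  for h in "$@"; do
    grep -qw "$h" "$file" || { echo "missing entry for $h"; bad=1; }
  done
  return $bad
}

check_hosts /tmp/hosts.sample MAX1 MAX2 && echo "hosts file looks OK"
```

Run it on each host in the cluster before clicking "Reconfigure for VMware HA".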
Rather than messing around with the hosts file, I went to verify the DNS configuration and found that it was set up incorrectly.
Fixing DNS settings and verifying DNS entries for all servers fixed it for me.
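A quick way to verify name resolution for each host, assuming `getent` is available on the service console (MAX1.BTEC and MAX2.BTEC are example names borrowed from the earlier post; substitute your own, and note that HA cares that every host can resolve every other host):

```shell
# Check that each cluster host's name resolves (via DNS or /etc/hosts)
for h in MAX1.BTEC MAX2.BTEC; do
  if getent hosts "$h" >/dev/null; then
    echo "$h resolves"
  else
    echo "$h does NOT resolve - fix DNS (or /etc/hosts)"
  fi
done
```

If `getent` isn't present, `nslookup` against your configured DNS server gives the same answer for the DNS side.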