I am standing up a new ESXi cluster with 5 nodes. All on identical, new hardware (HP BL490c booting from SD card). So far, so good. Hardware burn-in went well.
For some reason, one node is having HA problems. Despite several reconfigurations of HA, this one host always shows the following error:
"HA agent on hostname in cluster clustername in datacenter has an error: cmd addnode failed for primary node: Internal AAM Error - agent could not start: Unknown HA error"
So far, here is what I've tried:
- I have enabled and disabled HA several times.
- I have done "Reconfigure for VMware HA" on the troublesome host.
- I have confirmed NTP is properly configured on all hosts
- I have verified forward/reverse DNS on all hosts and vCenter
I am hesitant to do the unsupported tech support edit of the ESXi hosts file. I'd like to stick to a supported configuration if I can.
Does anyone have any thoughts on this before I open a support case?
Thanks all - you're the best! Points (and a beer at VMworld) for your helpful answers.
Tristan
Try removing the host from the vCenter and adding it again.
Marcelo Soares
VMWare Certified Professional 310/410
Virtualization Tech Master
Globant Argentina
Consider awarding points for "helpful" and/or "correct" answers.
Also, you may try restarting the management agents:
http://kb.vmware.com/kb/1003490
...and confirm name resolution is set up properly within the entire environment.
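One way to check this from an admin workstation is a small script that does a forward lookup and then a reverse lookup for each host and flags any mismatch. This is only a rough sketch: the function name `check_host` and the `example.com` hostnames are made up for illustration, and it assumes the workstation uses the same DNS servers as the hosts and vCenter.

```python
import socket


def check_host(fqdn, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Return a list of problems for one host (an empty list means it looks OK).

    `forward` and `reverse` default to the system resolver; they are
    parameters only so the check can be exercised without real DNS.
    """
    try:
        ip = forward(fqdn)
    except OSError as exc:  # socket.gaierror / socket.herror are OSError subclasses
        return [f"forward lookup failed for {fqdn}: {exc}"]

    try:
        name, aliases, _addresses = reverse(ip)
    except OSError as exc:
        return [f"reverse lookup failed for {ip}: {exc}"]

    # Compare case-insensitively and ignore a trailing dot on the FQDN.
    known = {n.lower().rstrip(".") for n in [name, *aliases]}
    if fqdn.lower().rstrip(".") not in known:
        return [f"{fqdn} -> {ip} -> {name} (mismatch)"]
    return []


def check_all(fqdns):
    """Run check_host over several names and return {fqdn: [problems]}."""
    return {fqdn: check_host(fqdn) for fqdn in fqdns}
```

Calling `check_all(["esx01.example.com", "vcenter.example.com"])` with your real names returns a dictionary in which any host with a non-empty list deserves a closer look.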
Whenever I see this, nearly 90% of the time it's because the hostname isn't set on the ESXi host. If you look at the failed Configure HA task under Tasks and Events, it will often give you a bit more information. Look for a message like "Failed to get hostname for localhost", or look at the /etc/hosts file from tech support mode and verify that the host can resolve its own name. Assuming that's the problem, the fix is to change the DNS configuration so the hostname is set correctly, then reconfigure for HA again.
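If you copy the contents of /etc/hosts off the host, the "can it resolve its own name" check can be sketched roughly like this. The helper names (`parse_hosts`, `self_entry_problems`) are made up for illustration, and it rests on an assumption on my part that a healthy file maps both the FQDN and the short name to a non-loopback address; treat the output as a hint, not a verdict.

```python
def parse_hosts(text):
    """Map each lower-cased name in hosts-file text to the addresses listed for it."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments, then whitespace
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), []).append(address)
    return table


def self_entry_problems(text, fqdn):
    """Return a list of problems with the host's own entries (empty means OK).

    Assumption: both the FQDN and the short name should be present and
    should map to something other than a loopback address.
    """
    table = parse_hosts(text)
    short = fqdn.split(".", 1)[0]
    problems = []
    for name in (fqdn, short):
        addresses = table.get(name.lower(), [])
        if not addresses:
            problems.append(f"no entry for {name}")
        elif all(a.startswith("127.") or a == "::1" for a in addresses):
            problems.append(f"{name} only maps to loopback: {addresses}")
    return problems
```

For example, pass the file contents as a string along with the name the host should have, such as `self_entry_problems(hosts_text, "esx05.example.com")` (a hypothetical name), and compare the result against what the other four hosts report.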
Removing / re-adding the host didn't work.
Restarting the management agents didn't work.
Hostname properly configured on host (confirmed in DNS settings) and DNS configuration checks out. Also, I'm not seeing any "failed to get hostname" type of messages.
I'm going to look in tech support mode to see what's up.
Thanks all ~ you guys are awesome.