I have no real solution for that, but reading the "reboot server" sentence made me post a reply.
When you stop the mgmt-vmware service, also send a stop to vmware-vpxa.
Additionally, run "ps aux|grep hostd" and kill any such processes that are still hung, beginning with the watchdog. Then run "ps aux|grep vpxa" and kill those the same way.
You can then start mgmt-vmware and vmware-vpxa, in that order.
This might save you most of the reboots.
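For reference, here is a rough sketch of that sequence as a service console script. The `run` helper, the `restart_agents` function and the `DRY_RUN` switch are my own additions for illustration, not anything shipped with ESX. It deliberately does not kill anything on its own; it only lists leftover processes so you can decide what to kill by hand.

```shell
#!/bin/sh
# Sketch of the stop / check / start sequence for the ESX management agents.
# DRY_RUN=1 (the default here) only prints the service commands;
# set DRY_RUN=0 on the host to actually run them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

restart_agents() {
    # Stop both management agents first.
    run service mgmt-vmware stop
    run service vmware-vpxa stop

    # On a real run, list leftover hostd / vpxa processes so hung ones can be
    # killed manually (watchdog first). The [h] / [v] brackets keep grep from
    # matching its own process.
    if [ "$DRY_RUN" != "1" ]; then
        ps aux | grep '[h]ostd'
        ps aux | grep '[v]pxa'
    fi

    # Start the agents again, mgmt-vmware before vmware-vpxa.
    run service mgmt-vmware start
    run service vmware-vpxa start
}
```

Calling `restart_agents` with the default settings just prints the four service commands, so you can check the order before running it for real.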
Your problem is DNS. I had the same problem and eventually figured out that this was the cause.
The hosts file on that machine should have both the short name and the FQDN for that ESX host. /etc/sysconfig/network should also be updated to match. The VC should then be able to ping the ESX host by its FQDN.
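A quick way to check the hosts file part is sketched below. `has_host_entry` is a hypothetical helper, and the IP address and host names in the usage line are placeholders, so substitute your own.

```shell
#!/bin/sh
# has_host_entry FILE IP FQDN SHORT
# Succeeds (exit 0) if FILE has a non-comment line for IP that lists both
# the FQDN and the short name; fails otherwise.
has_host_entry() {
    awk -v ip="$2" -v fqdn="$3" -v short="$4" '
        { sub(/#.*/, "") }               # strip comments before matching
        $1 == ip {
            f = 0; s = 0
            for (i = 2; i <= NF; i++) {
                if ($i == fqdn)  f = 1
                if ($i == short) s = 1
            }
            if (f && s) found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Example usage (placeholder values):
#   has_host_entry /etc/hosts 192.0.2.10 esx01.example.com esx01 \
#       || echo "hosts entry is missing the short name or the FQDN"
```

This only looks at the hosts file. It does not check /etc/sysconfig/network or whether the VC can actually ping the host.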
After making those updates, delete the certificates and reissue new ones.
To do that, delete or rename the files in /etc/vmware/ssl and run "service mgmt-vmware restart" to reissue the certificates. If you do the updates in this order, you can save yourself a reboot and only restart the service once.
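If you would rather move the old certificate files aside than delete them, something like the following should work. `backup_ssl_files` is a hypothetical helper, and the sibling backup directory name is just my choice; the only parts taken from the steps above are the /etc/vmware/ssl path and the service restart.

```shell
#!/bin/sh
# backup_ssl_files SSL_DIR
# Moves every regular file in SSL_DIR to a timestamped sibling directory
# (SSL_DIR.bak-YYYYmmdd-HHMMSS) and prints that directory's path.
backup_ssl_files() {
    ssl_dir="${1%/}"
    backup_dir="$ssl_dir.bak-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$backup_dir" || return 1
    for f in "$ssl_dir"/*; do
        # Skip subdirectories and the unexpanded glob of an empty directory.
        if [ -f "$f" ]; then
            mv "$f" "$backup_dir"/ || return 1
        fi
    done
    echo "$backup_dir"
}

# Example usage on the ESX host, followed by the restart that reissues
# the certificates:
#   backup_ssl_files /etc/vmware/ssl && service mgmt-vmware restart
```

Keeping the old files around means you can put them back if the restart does not produce new certificates.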
Verify that the host has the proper DNS entries for its name and IP, that the VC can reach the ESX host by that name, and that the ESX host can reach the VC by FQDN.
That should fix it.
I had DNS set up correctly. The only thing that I didn't have was the short name of the ESX host in the hosts file. I think what got the ESX server to reconnect to VirtualCenter was deleting the ssl files and having the certificates reissued. That was much easier than doing a reboot (which is always the last resort).
I'm still at a loss as to why this happened in the first place. I'm not convinced that the lack of a short name in the hosts file would cause this. I had another ESX server from the same site disconnect yesterday after I posted this original message, but it was able to connect back to VirtualCenter without any issue, and it doesn't have its short name in the hosts file.
Thanks again for the help.