Hi, I have a very small production cluster, and today I saw one of the hosts (along with its VMs) show as disconnected. I have tried the following:
- I have restarted the management agents via the console.
- I can SSH in successfully
- I can telnet to port 902
- I can connect directly to it via the VMware client
- None of the VMs have more than one snapshot
- When I try to reconnect it to vCenter, I get the error: "A general system error occurred: Timed out waiting for vpxa to start"
These are production VMs and they are still online. I can't vMotion them off because they show as disconnected. The host is running ESXi 5.0.0, build 515841.
Any suggestions?
Did you also try removing the host and adding it again, or did you just click "Reconnect"?
Regards,
Mario
Have you also run these checks (ping, DNS, etc.) between your vCenter server and the affected host, or only from another client or workstation?
Regards,
Mario
Thank you for the reply. I performed all communication tests from the vCenter server with success.
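For anyone who wants to repeat these communication tests in one step, here is a minimal Python sketch to run from the vCenter server. It resolves the host name, attempts a reverse lookup, and tries a TCP connection to port 902 (the port tested with telnet above). The function name `check_host` and the returned dictionary layout are my own invention for illustration, not part of any VMware tool. Note that a successful result only shows that name resolution and the network path work; it says nothing about whether the vpxa agent on the host is actually running.

```python
import socket


def check_host(hostname, port=902, timeout=5.0):
    """Return DNS and TCP reachability details for one host.

    Hypothetical helper: the name and result layout are illustrative only.
    """
    result = {
        "hostname": hostname,
        "address": None,      # forward lookup result (IPv4)
        "reverse": None,      # reverse lookup result, if any
        "port_open": False,   # True if a TCP connection succeeded
    }

    # Forward lookup: name -> IPv4 address.
    try:
        result["address"] = socket.gethostbyname(hostname)
    except OSError:
        return result  # without an address there is nothing more to test

    # Reverse lookup: address -> name. A failure here is recorded as None.
    try:
        result["reverse"] = socket.gethostbyaddr(result["address"])[0]
    except OSError:
        pass

    # TCP connect, similar to "telnet <host> 902".
    try:
        with socket.create_connection((result["address"], port), timeout=timeout):
            result["port_open"] = True
    except OSError:
        pass

    return result
```

For example, `check_host("esxi01.example.local")` (a placeholder host name) returns a dictionary you can compare against what you expect for the address, the reverse name, and the port state.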
I have not tried removing the host entirely because I am a little nervous about doing so without asking first. However, after I try to reconnect and get the error, a dialog box pops up as if it wants to add a new host. When I enter the hostname and credentials, I get the same error.
Have you reviewed the host log files via console?
Have you verified that all service console settings (gateway, etc.) are correct?
You could reinstall the HA agent: http://vcp4.wordpress.com/2010/03/20/re-installing-ha-agent-in-esxi/
Roger lund
I removed the host and tried to re-add it, and got the same error. There is some stuff in the vpxa logs (attached screenshot) that I viewed from the console screen, but I am not really sure what it means. I guess I will have to schedule downtime to move the VMs off, and maybe a reboot will help.
rlund, I did not attempt uninstalling the HA agents as that seems to be a different issue.
Does anyone know if running "services.sh restart" will affect running VMs? I'm trying to avoid having to reboot the host.
VMs aren't impacted when you run the command.
Restarting the agents will not impact the VMs. The host will be disconnected from vCenter and will reconnect after some time.
In addition, you can restart the management agents from the DCUI.
Is running "/sbin/services.sh restart" from the command line the same as restarting the management agents from the DCUI, or does services.sh restart more or different services? Thank you
services.sh will print output showing which services start successfully and which fail.
Did you try this, and if so, what was the output for vpxa?
I just ran "/sbin/services.sh restart" and afterwards was able to add the host back into the cluster. During the restart, I noticed the output showed that vpxa was not running, even though I had restarted the management agents via the DCUI earlier. Also, when I re-added the host, a dialog box said "this host is currently being managed by 10.1.x.x.... do you want to proceed...." (the IP it showed is the current IP of my vCenter server). Anyway, I chose "proceed" and it seemed to work.
Hopefully it will stay! Thank you guys for the assistance.
