VMware Cloud Community
Titans99
Enthusiast

Host disconnected from vCenter 5

Hi, I have a very small production cluster, and today I saw one of the hosts (along with its VMs) showing as disconnected. I have tried the following:

- I have restarted the management agents via the console.

- I can SSH in successfully

- I can telnet to port 902
- I can connect directly to it via the VMware client

- None of the VMs have more than one snapshot
- When I try to reconnect it to vCenter, I get the error:  "A general system error occurred: Timed out waiting for vpxa to start"
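For anyone who wants to repeat the port 902 check from the vCenter server without a telnet client, here is a minimal Python sketch. The function name `tcp_port_open` and the hostname in the usage comment are placeholders of mine, not anything from this thread. It only tests whether a TCP connection can be opened; as this thread shows, a reachable port does not by itself mean vpxa is running.

```python
import socket


def tcp_port_open(host: str, port: int = 902, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example usage (hypothetical hostname, run from the vCenter server):
#   tcp_port_open("esxi01.example.local")
```

Running it from the vCenter server rather than a workstation matters, since that is the path the reconnect actually uses.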

These are production VMs and they are still online. I can't vMotion them off because they show as disconnected. The host is running ESXi 5.0.0, build 515841.

Any suggestions?

11 Replies
vMario156
Expert

Did you also try removing the host and adding it again, or did you just click "Reconnect"?

Regards,

Mario

Blog: http://vKnowledge.net
vMario156
Expert

Have you also run these checks (ping, DNS, etc.) between your vCenter server and the affected host, or just from another client or workstation?
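As a rough illustration of the DNS part of that check, the sketch below resolves a name to its IPv4 addresses and then reverse-resolves each address, so a forward/reverse mismatch becomes visible. It is meant to be run on the vCenter server itself. The function name `forward_and_reverse` and the hostname in the usage comment are my own placeholders, and this is only one way to do the check.

```python
import socket


def forward_and_reverse(hostname: str) -> dict:
    """Map each IPv4 address of hostname to its reverse-lookup name (or None)."""
    results = {}
    # Forward lookup: returns (canonical name, aliases, list of IPv4 addresses).
    _, _, addresses = socket.gethostbyname_ex(hostname)
    for addr in addresses:
        try:
            # Reverse lookup: first element is the primary host name.
            results[addr] = socket.gethostbyaddr(addr)[0]
        except OSError:
            # No reverse record (or lookup failure) for this address.
            results[addr] = None
    return results


# Example usage (hypothetical hostname):
#   forward_and_reverse("esxi01.example.local")
```

If the reverse name differs from the name you used to add the host, or comes back as None, that is worth fixing before looking further.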

Regards,

Mario

Blog: http://vKnowledge.net
Titans99
Enthusiast

Thank you for the reply. I performed all communication tests from the vCenter server with success.

I have not tried removing the host entirely because I am a little nervous about doing so without asking first. However, after I try to reconnect and get the error, a dialog box pops up as if it wants to add a new host. When I enter the hostname and credentials, I get the same error.

rlund
Enthusiast

Have you reviewed the host log files via the console?

Have you verified that all service console settings (gateway, etc.) are correct?

You could also reinstall the HA agent: http://vcp4.wordpress.com/2010/03/20/re-installing-ha-agent-in-esxi/

Roger Lund

Roger Lund Minnesota VMUG leader Blogger VMware and IT Evangelist My Blog: http://itblog.rogerlund.net & http://www.vbrainstorm.com
Titans99
Enthusiast

I removed the host and tried to re-add it - same error. There is some stuff in the vpxa logs (attached screenshot) that I viewed from the console screen, but I am not really sure what it means. I guess I will have to schedule downtime to move the VMs off, and maybe a reboot will help.

rlund, I did not attempt uninstalling the HA agents as that seems to be a different issue.

Titans99
Enthusiast

Does anyone know if running "services.sh restart" will affect running VMs? I'm trying to avoid having to reboot the host.

Dave_Mishchenko
Immortal

VMs aren't impacted when you run the command.

aravinds3107
Virtuoso

Restarting the agents will not impact the VMs. The host will be disconnected from vCenter and will reconnect after some time.

In addition to this, you can restart the management agents from the DCUI:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=100349...

If you find this or any other answer useful please consider awarding points by marking the answer correct or helpful |Blog: http://aravindsivaraman.com/ | Twitter : ss_aravind
Titans99
Enthusiast

Is running "/sbin/services.sh restart" from the command line the same as restarting the Management Agents from the DCUI, or does services.sh restart more or different services? Thank you.

nielse
Expert

services.sh prints output showing which services start successfully and which fail.

Did you try this, and if so, what was the output for vpxa?

@nielsengelen - http://foonet.be - VCP4/5
Titans99
Enthusiast

I just ran "/sbin/services.sh restart" and afterwards was able to add the host back into the cluster. During the restart, I noticed the output showed that vpxa was not running, even though I had restarted the management agents via the DCUI earlier. Also, when I re-added the host, a dialog box said "this host is currently being managed by 10.1.x.x.... do you want to proceed...." (the IP it showed is the current IP of my vCenter server). Anyway, I chose "proceed" and it seemed to work.

Hopefully it will stay!  Thank you guys for the assistance.
