VMware Cloud Community
DrGorgo
Contributor

No connection after restart of management agents

Hi all,

I have an issue after restarting the management agents on an ESX 4 server.

I restarted the management agents (service mgmt-vmware / service vmware-vpxa) as described in KB article 1003490, and judging from the shell output that seemed to be successful.

Nonetheless, vCenter Server shows that the server is not responding, and I also cannot connect directly to that ESX server with the vSphere Client. All virtual machines on that server are up and running and do not seem to have any issues.

Does anybody have an idea or advice regarding this issue?

Thanks for any help in advance.

Kind regards

16 Replies
AndreTheGiant
Immortal

Can you ping the service console?

First try restarting mgmt-vmware to see if you can then connect directly to the host.

If you try to connect to the host with a browser, what happens?

Andre

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
rManic
Expert

Wait for some time and try reconnecting the ESX host to vCenter again.

It is a silly idea, but it will work.

regards

Manic

Regards Manic
DrGorgo
Contributor

Hi Andre,

thanks for answering.

Actually, I can ping the server and the service console, and I can also telnet to port 902, for example. But I still cannot connect directly to the server via the vSphere Client.

And a connection via browser is not possible either.

Also, a reconnect in vCenter does not work.

Regards

AndreTheGiant
Immortal

It seems that the hostd daemon is not working.

What's the output of this command?

netstat -anp | grep 443
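A plain grep for 443 also matches lines where 443 only appears in a remote port or a PID. As a rough sketch, the filter below keeps just the sockets whose local address ends in the given port. The function name port_sockets is made up for illustration, and the filter assumes the usual netstat -anp column layout (local address in the fourth column).

```shell
# Hypothetical helper (not a VMware tool): read `netstat -anp` output on stdin
# and print only tcp/udp lines whose LOCAL address ends in the given port.
# Assumes the local address is the 4th column, as in net-tools netstat.
port_sockets() {
    awk -v p="$1" '$1 ~ /^(tcp|udp)/ && $4 ~ (":" p "$") { print }'
}

# Usage on the host (run as root so the PID/Program column is filled in):
#   netstat -anp | port_sockets 443
```

The last column of each remaining line then shows which process, if any, owns the socket.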

Try this:

/etc/init.d/mgmt-vmware stop

killall -9 vmware-hostd

/etc/init.d/mgmt-vmware start

Andre

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
rManic
Expert

I had the same issue too. When I tried connecting again after 30 to 45 minutes, it worked.

This has happened to me several times.

I believe this is because of login history or temp files in the browser and the vSphere Client.

Please post an update once you have succeeded.

regards

Manic

Regards Manic
DrGorgo
Contributor

Hi Andre,

Please find attached the output of netstat.

Sadly the killall command didn't sort the issue out either.

Any further ideas?

@rManic

I wish that were the issue, but sadly I have been working on this for nearly four hours, which should be long enough if that were the cause.

Nevertheless, your help is very much appreciated.

Regards

AndreTheGiant
Immortal

I suggest you restart the host when possible.

This will free the port and clear these strange connections.

Andre

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
danm66
Expert

You're on the right track... if you can't log in directly, then your issue is with hostd.

Do a 'service mgmt-vmware stop', then 'ps -ef|grep hostd'; you should only see the grep process. If not, then hostd isn't really stopping.

If it's stopping, then do 'service mgmt-vmware start' and verify that it's starting with 'ps -ef|grep hostd'. Make sure the PID is not changing if you run 'service mgmt-vmware status' a couple of times, a couple of minutes apart.
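The stop/start check can be scripted roughly as follows. This is only a sketch: hostd_pids is a made-up helper name, and it simply pulls the PID column out of 'ps -ef' output for lines that mention hostd, ignoring the grep process itself.

```shell
# Hypothetical helper: read `ps -ef` output on stdin and print the PID
# (2nd column) of every line mentioning hostd, skipping any grep process.
hostd_pids() {
    awk '/hostd/ && !/grep/ { print $2 }'
}

# Usage on the host:
#   service mgmt-vmware stop;  ps -ef | hostd_pids   # expect no output
#   service mgmt-vmware start; ps -ef | hostd_pids   # note the PID
#   sleep 120;                 ps -ef | hostd_pids   # PID should be unchanged
```

If the PID changes between the last two runs, hostd is most likely being restarted over and over instead of coming up fully.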

If that's good and you're still not connecting with the client, then something is preventing hostd from starting completely or from getting processor time. Check /var/log/vmkwarning for messages. This log should be empty or nearly empty. If there are any recent messages, then post them up here and/or search on them. Also, do a 'df -h' to see if you have plenty of disk space on all of the filesystems. None of them should normally be above 80% usage with a default setup.
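To avoid reading the df output by eye, the 80% rule of thumb can be checked with a small filter. This is a sketch under assumptions: full_filesystems is a hypothetical helper name, it expects the POSIX 'df -P' layout (capacity in the fifth column, mount point in the sixth), and it does not handle mount points that contain spaces.

```shell
# Hypothetical helper: read `df -P` output on stdin and print every mount
# point whose usage is above the limit (first argument, default 80 percent).
full_filesystems() {
    awk -v limit="${1:-80}" 'NR > 1 {
        use = $5
        sub(/%/, "", use)              # strip the percent sign
        if (use + 0 > limit + 0) print $6, $5
    }'
}

# Usage on the host:
#   df -P | full_filesystems 80
```

No output means that no filesystem is above the limit.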

If it's still not working, check /var/log/vmware/hostd.log for clues. Also, stop any management agents that were installed separately, like hpasm if it's an HP box, or Dell's or IBM's software.

DrGorgo
Contributor

Hi danm66,

Obviously hostd is causing the issue.

If I follow your suggestion and stop the mgmt-vmware service, the grep shows that hostd is still running as an uninterruptible process. The hostd.log also shows that this service is obviously causing the problem.
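One way to confirm the uninterruptible state is the STAT column of ps, where a leading D marks uninterruptible sleep. A minimal sketch, assuming a procps-style ps that accepts '-eo pid,stat,args'; d_state_hostd is a hypothetical helper name.

```shell
# Hypothetical helper: read `ps -eo pid,stat,args` output on stdin and print
# PID and STAT for hostd processes whose state starts with D
# (uninterruptible sleep). The first line is skipped as the header.
d_state_hostd() {
    awk 'NR > 1 && $2 ~ /^D/ && $3 ~ /hostd/ { print $1, $2 }'
}

# Usage on the host:
#   ps -eo pid,stat,args | d_state_hostd
```

A process in that state does not act on signals, SIGKILL included, until it leaves the state, which would be consistent with 'killall -9' having no effect here.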

Support, like Andre, has already stated that this can only be solved by restarting the host, which can only be done after shutting down all servers running on this box.

Thank you very much for your effort and help.

Kind regards

DrG...

danm66
Expert

look at http://kb.vmware.com/kb/1007261 and see if that helps. This is assuming you have already tried to use the kill command to kill the hostd process.

DrGorgo
Contributor

Once again a good idea and a great link, but it still doesn't work. :(

Watchdog states that the file is not found and consequently it is not able to kill the process (even though the file is in place and the process is obviously running).

However I will schedule a reboot of the box and see if that solves the issue.

Thank you very much once more.

Cheers

DrG...

danm66
Expert

One last thing, for my own benefit... did you try doing a kill on the hostd process? Just wondering, as I've only heard of it happening a couple times where it wouldn't die until a reboot was done.

Tanav
Enthusiast

Hi,

I put the wrong solution in my earlier post. Please find the correct solution below.

You may see this from time to time; give it some time to reconnect. If it still fails to reconnect or respond, one thing you can do is restart the management agent. Sometimes it gets hosed, and restarting will fix it.

service mgmt-vmware restart

You can also take a look at /var/log/messages, /var/log/vmkernel and /var/log/vmkwarning to see if there are any issues. One thing to note: if the host is part of an HA/DRS cluster, ensure that full DNS resolution, both forward and reverse, is working, or else you may see issues.
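The forward/reverse check can be sketched as a round trip: resolve the name to an address, resolve that address back, and compare. All function names below are hypothetical. The default lookups use getent, which follows nsswitch.conf and therefore also consults /etc/hosts, so this is not a pure DNS test; swap in another lookup tool if that matters. The comparison is case-sensitive and expects the fully qualified name.

```shell
# Hypothetical lookup helpers; redefine them to use a different resolver tool.
resolve_forward() { getent hosts "$1" | awk '{ print $1; exit }'; }
resolve_reverse() { getent hosts "$1" | awk '{ print $2; exit }'; }

# Succeeds only if name -> address -> name leads back to the same name.
dns_roundtrip_ok() {
    _ip=$(resolve_forward "$1")
    [ -n "$_ip" ] || return 1
    _name=$(resolve_reverse "$_ip")
    [ "$_name" = "$1" ]
}

# Usage (esx01.example.com is a placeholder host name):
#   dns_roundtrip_ok esx01.example.com && echo "DNS ok" || echo "DNS mismatch"
```

Run it for each cluster host, from the vCenter side as well as from the service console if possible.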

If you see "cannot connect to the specified host; the host may not be available on the network", restart the services below:

vmware-vpxa

mgmt-vmware

xinetd

vmware-vmkauthd

Check what services are running. Is your hostd service running?

service --status-all |grep running

Also try:

ps -ef | grep vmware-hostd

Do you see a /usr/lib/vmware/hostd/vmware-hostd running?

I uninstalled the vpx agent and that seemed to work. Here are the steps I used to uninstall it and let VC reinstall vpxa:

Check for vpxa version: rpm -qa |grep vpxa

You should see something like VMware-vpxa-2.5.0-104215; you'll use this later.

Stop the VMware management service: service mgmt-vmware stop

Stop the vpx agent: /etc/init.d/vmware-vpxa stop

Uninstall vpx agent: rpm -e VMware-vpxa-2.5.0-104215

Expect the following: 'warning: /etc/vmware/vpxa.cfg saved as /etc/vmware/vpxa.cfg.rpmsave'

Verify vpxa has uninstalled: rpm -qa |grep vpxa (or vpx just in case)

Start the VMware management service: service mgmt-vmware start

Now go back into VirtualCenter, remove the disconnected host, and add the host again. Initially, it may fail with a bad username or password error or another error, but try again and it should work.
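The host-side steps above can be collected into a command list that is printed for review before anything is run. This is a sketch, not a supported procedure: vpxa_uninstall_plan is a made-up name, and the package name must be whatever 'rpm -qa |grep vpxa' reports on the host.

```shell
# Hypothetical helper: print the host-side command sequence for the given
# vpxa package name. Nothing is executed; the output is meant to be reviewed.
vpxa_uninstall_plan() {
    cat <<EOF
service mgmt-vmware stop
/etc/init.d/vmware-vpxa stop
rpm -e $1
rpm -qa | grep vpxa
service mgmt-vmware start
EOF
}

# Usage on the host:
#   vpxa_uninstall_plan "$(rpm -qa | grep vpxa)"
```

Printing first makes it easy to confirm that the package name is the expected one before removing anything. The fourth command should print nothing once the agent is gone.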

DrGorgo
Contributor

Hi all,

and thank you very much for your support in this case!

But the only thing that finally worked was a reboot of the server. This has now been done, and everything is up and running again without issues.

So one more big thank you for your effort and take care.

Cheers

DrG...

hama007
Contributor

Hi,

I have the same problem and will restart my ESX host.

My problem is that I have 8 running VMs on the ESX host!

Can I move the VMs to another ESX host?

How can I do this? With the command line?

DrGorgo
Contributor

Hi hama007,

I did not find any solution to that problem other than powering down the VMs and then restarting the ESX server.

The problem is: without the agents, the VMs cannot be moved to the other hosts in the cluster. As far as I understood from working with VMware support, there is no way to do that, neither via the GUI nor the console.

In the end it was a matter of scheduling a downtime for the affected servers.

Sorry for not having a more satisfying answer.

Cheers

DrG...
