VMware Cloud Community
andremostert
Contributor

Host in Cluster is marked as "Not Responding"

Hi guys

I have been searching around the forums like mad. Here is my problem.

One of the hosts in the 4-host cluster is grayed out and marked as "Not Responding". It is ESX 3.5, with VI Client version 2.5.

I had a change window last night to restart the management service (mgmt-vmware), but it made no difference. I have also followed this web link for basic fault finding.

http://www.virtualizationteam.com/virtualization-vmware/vmware-vi3-virtualization-vmware/esx-host-di...

I have only issued this command:

service mgmt-vmware restart (DONE no change)

The next step will be to restart the following services (question: does this affect the VMs on the host?)

service vmware-vpxa restart

service vmware-vmkauthd restart

service xinetd restart
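A minimal sketch of how the three restarts above could be scripted so that the run stops at the first failure. The service names are the ones listed in this post; the `DRY_RUN` switch and the function name are hypothetical additions so the sequence can be previewed before a change window. It does not answer whether the restarts affect running VMs.

```shell
# Service names taken from the post above; order is the order listed there.
AGENTS="vmware-vpxa vmware-vmkauthd xinetd"

# Restart each agent in turn. With DRY_RUN=1 (hypothetical switch) the
# commands are only printed, nothing is restarted.
restart_agents() {
    for svc in $AGENTS; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "would run: service $svc restart"
        else
            # Stop at the first failure so the remaining agents are left alone.
            service "$svc" restart || {
                echo "restart of $svc failed" >&2
                return 1
            }
        fi
    done
}
```

Running `DRY_RUN=1 restart_agents` first prints the exact commands without touching the host.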

As far as I can see, my mgmt-vmware restart did not work. I ran this command to check the start date of the service process:

ps -ef | grep vmware-hostd

This showed me the PID and a date stamp. The date stamp was from Sep06. If I run kill -9 against the PID, the process is not killed; the time stamp stays the same.

Any other suggestions besides rebooting the host? I have 5 prod servers on it.
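A process that survives kill -9 is typically either in uninterruptible sleep (STAT starting with D) or already a zombie (STAT starting with Z); in both cases SIGKILL has no visible effect. The sketch below shows one way to check which state vmware-hostd is in. The function names are hypothetical, and it assumes the `ps` on the service console accepts the `-o pid,stat,lstart,args` format keywords.

```shell
# Map a ps STAT value to a rough verdict.
# D = uninterruptible sleep, Z = zombie (defunct); kill -9 does not clear either.
classify_state() {
    case "$1" in
        D*) echo "uninterruptible" ;;
        Z*) echo "zombie" ;;
        *)  echo "killable" ;;
    esac
}

# Print PID, STAT, verdict, then start time and command line for every
# vmware-hostd process. The [v] keeps grep from matching itself.
report_hostd() {
    ps -eo pid,stat,lstart,args | grep '[v]mware-hostd' |
    while read -r pid stat rest; do
        echo "$pid $stat $(classify_state "$stat") $rest"
    done
}
```

Calling `report_hostd` on the host shows at a glance whether the old Sep06 process is one that a signal can still reach.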

Cheers and thank you

10 Replies
techsuresh
Enthusiast

Hi,

I understand your problem; I faced the same issue some days back.

I documented everything I did to resolve it.

Please find the attachment.

Suresh.

andremostert
Contributor

Thanks for the info. I have tried all of the commands to get rid of the vmware-hostd process, but no luck.

It looks like a reboot is the only way out. Is there a way to start the VMs from the command line, or can you only start them via the VI Client?

Andre

techsuresh
Enthusiast

We can power on the VMs both ways.

Is your ESX server in an HA/DRS cluster? If so, the VMs will be restarted on the other ESX servers while your ESX machine reboots. If it is not in a cluster, you have to start the VMs manually.
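For the command-line route, a rough sketch using `vmware-cmd` from the service console: `vmware-cmd -l` lists the registered .vmx paths and `vmware-cmd <path> start` powers one on. The function name and the `DRY_RUN` switch are hypothetical, and this assumes the management agent on the host is healthy again, since `vmware-cmd` may not respond while hostd is hung.

```shell
# Read one .vmx path per line on stdin and power each VM on.
# With DRY_RUN=1 (hypothetical switch) the commands are only printed.
start_all_vms() {
    while IFS= read -r cfg; do
        # Skip blank lines in the listing.
        [ -n "$cfg" ] || continue
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "would run: vmware-cmd $cfg start"
        else
            # </dev/null keeps vmware-cmd from swallowing the remaining list.
            vmware-cmd "$cfg" start </dev/null
        fi
    done
}
```

Intended usage on the host would be `vmware-cmd -l | start_all_vms`, ideally after a first pass with `DRY_RUN=1`.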

good luck.

Suresh

andremostert
Contributor

I cannot connect to the host with the VI Client; I can only SSH to it. So I will not be able to VMotion the servers.

I will have to power down each server via MSTSC, then issue a restart command on the host and hope it comes back online. If it does not restart, I will power cycle the host via the HP iLO power button.

techsuresh
Enthusiast

There is no need to restart the VMs manually, because HA keeps working even when the ESX host is disconnected from VC; HA is independent of VC. It will take care of restarting the VMs when you issue the reboot command on the ESX box. Before rebooting the ESX host, check whether the HA service is running with "service vmware-aam status". If the HA service is running fine, you can simply issue the reboot command on the ESX host; there is no need to restart the VMs manually on another ESX box.
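The pre-reboot check can be wrapped in a small helper, sketched below. The function name is hypothetical, and it assumes the init script's status action exits non-zero when the agent is stopped, which is the usual convention but is not confirmed here for vmware-aam.

```shell
# Run the given status command quietly and report the result.
# Assumed usage on the host: agent_running service vmware-aam status
agent_running() {
    if "$@" >/dev/null 2>&1; then
        echo "running"
    else
        echo "not running"
        return 1
    fi
}
```

Something like `agent_running service vmware-aam status && reboot` would then only reboot when the check passes.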

Suresh

andremostert
Contributor

I checked just now and the "vmware-aam" service is running.

I have a change control slot for this weekend and the after-hours guys will do it. I will update the post on Monday.

Andre

techsuresh
Enthusiast

I am not sure whether the VMs will restart on another ESX server even if the HA agent is running on your faulty ESX server, because in my case it didn't work. I suspect HA won't work if the hostd service is in a hung (defunct) state.

You should be prepared for your virtual machines to be powered off for a long time; in my case it took one hour to reboot the ESX machine.

Hope we get a positive update from you. Good luck.

Suresh

andremostert
Contributor

Update,

The server was restarted after shutting down all the VMs. It restarted without any problems and rejoined the cluster. I checked it this morning and it is all working perfectly.

Thanks for the help mate.

techsuresh
Enthusiast

Thanks for updating Andre.

Suresh

MauroBonder
VMware Employee

Does the Service Console IP respond normally?

Can you ping it by hostname.domain?

For example, with SC IP 1.1.1.1:

ping 1.1.1.1

OK

With hostname vmware.paloalto.com:

ping vmware.paloalto.com

Replies come from 1.1.1.1: OK.

If the answer to both questions is yes, check this KB: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=100371...

*If you found this information useful, please consider awarding points for "Correct" or "Helpful"*
