VMware Cloud Community
VMware_Communi7
Contributor

Host and VMs go in not responding mode

Hi,

One of my three ESX servers goes into a not-responding state, and the configured HA does not move the VMs to the other servers. All the VMs on that host also hang; none of them replies to pings.

If I try to reboot the service console with init 6, that doesn't work either, and we have to hard-reset the server.

I have ESX 3.5 Update 3 with VC 2.5. The problem occurs at any time of day, always on one particular host.

The main question is why the host is hanging and, if the host's resources are not available, why HA is not moving the VMs.

Please help with the cause and preventive measures.

Thanks

Pratima

3 Replies
Lightbulb
Virtuoso

Are there any entries in /var/log/vmkernel or /var/log/vmkwarning?

Are there any third-party agents or software running on this system, e.g. HP management agents, a NetBackup client, etc.?
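
If the logs are long, a small filter can help with the log check suggested above. The following is only a rough sketch: it assumes you have copied /var/log/vmkernel and /var/log/vmkwarning off the host to a workstation with Python 3, and the keyword list (SUSPECT_PATTERNS) is a hypothetical starting point, not an official list of ESX error strings.

```python
import re
import sys
from typing import Iterable, List, Sequence

# Hypothetical keywords; adjust to whatever is worth flagging in your logs.
SUSPECT_PATTERNS = ("warning", "error", "timeout", "timed out",
                    "abort", "fail", "reset")


def find_suspect_lines(lines: Iterable[str],
                       patterns: Sequence[str] = SUSPECT_PATTERNS) -> List[str]:
    """Return the lines that contain any of the patterns (case-insensitive)."""
    regex = re.compile("|".join(re.escape(p) for p in patterns), re.IGNORECASE)
    return [line.rstrip("\n") for line in lines if regex.search(line)]


def scan_log(path: str) -> List[str]:
    """Read one log file and return its suspect lines."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, "r", errors="replace") as handle:
        return find_suspect_lines(handle)


if __name__ == "__main__":
    # Paths come from the command line; the defaults are the two logs
    # named in this thread.
    paths = sys.argv[1:] or ["/var/log/vmkernel", "/var/log/vmkwarning"]
    for log_path in paths:
        try:
            hits = scan_log(log_path)
        except OSError as exc:
            print(f"{log_path}: cannot read ({exc})")
            continue
        print(f"{log_path}: {len(hits)} suspect line(s)")
        for hit in hits[-20:]:  # show only the most recent 20 matches
            print("  " + hit)
```

The entries just before the time the host stopped responding are the interesting ones, so it is worth comparing the timestamps of the last matches with the time of the hang.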

weinstein5
Immortal

When the host is not responding, are the VMs still active and accessible? More than likely this is an issue with VC losing communication with that host. I would look at networking as a possible cause.
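
One quick way to test the networking theory is to check, from the VC server or another machine on the management network, whether the host's service console still accepts TCP connections while it shows as not responding. This is a minimal sketch only: the host name esx-host.example.com is a placeholder, and the ports listed (22, 443, 902) are my assumption of the ones worth trying, so substitute whatever applies in your environment.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder host name and assumed ports; replace with your own values.
    host = "esx-host.example.com"
    for port in (22, 443, 902):
        state = "open" if tcp_reachable(host, port) else "unreachable"
        print(f"{host}:{port} {state}")
```

If the ports still answer while VC reports the host as not responding, that points towards the management agents or VC communication rather than a fully hung host. If nothing answers and the VMs do not reply to pings either, a hang of the host itself looks more likely.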

If you find this or any other answer useful please consider awarding points by marking the answer correct or helpful

Erik_Zandboer
Expert

Hi,

More information is probably needed. To make sure: is it always the same ESX host that has the problem? If so, I would recommend taking this host down and running a memory test tool (for example memtest86) on it for at least 48 hours. Memory errors are often the cause of this kind of issue.

Given that this behaviour appears to be tied to that one box, I would consider doing a fresh install if the memory checks out OK.

Why does HA not fail the VMs over? Hard to say. Perhaps the HA setting for default behaviour is "leave VM powered on" and something strange happens on the host (maybe it does not release the virtual disks of the stuck VMs for some reason).

Visit my blog at http://www.vmdamentals.com