VMware Cloud Community
Portuguese_Bend
Contributor

What might cause a VM to lose the VMware Tools heartbeat?

We have an old Windows Server 2003 VM that recently rebooted for no apparent reason. From what I can see, it was reset by vSphere HA because of a VMware Tools heartbeat failure.

I'm wondering what could cause the heartbeat to be lost.

Thanks.

2 Replies
hussainbte
Expert

Why would VMware Tools stop sending heartbeats and the VM stop generating I/O? Most likely because the guest operating system on the VM has crashed (e.g. a Blue Screen of Death) or become otherwise unresponsive. At that point, the best way to keep the application as available as possible is to reset the VM.


I also found the information below on a blog about preventing false positives. I am not sure whether it still applies to later versions of vCenter, but most probably it does.


Since this feature relies on monitoring heartbeats through VMware Tools, certain events can cause the heartbeat to stop for longer than the configured failure interval, thereby triggering a false positive and resetting the VM. One example is an upgrade of VMware Tools on a VM, which causes heartbeats to stop temporarily while the upgrade runs. For this reason VMware changed the VM Monitoring (VMM) feature in vCenter Server 2.5 Update 4 to also monitor the VM's disk and network activity. Therefore, even if no heartbeats are received within the failure interval, the VM is not reset unless no disk or network activity is detected for a predetermined I/O stats interval. A VM whose guest OS is truly locked up will typically show no disk or network activity in addition to the loss of heartbeat. This added level of monitoring helps eliminate false positives. Additionally, you can raise the failure detection interval above the default of 30 seconds. This setting is located in the HA advanced options section.
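The decision described in that excerpt can be sketched roughly as follows. This is only an illustration of the logic, not VMware's actual implementation: the names `VmStatus` and `should_reset` are made up for the example, and the 120-second I/O stats interval is an assumed value chosen for illustration (the excerpt only gives the 30-second failure interval).

```python
from dataclasses import dataclass


@dataclass
class VmStatus:
    """Hypothetical snapshot of what the monitoring service last observed."""
    seconds_since_heartbeat: float  # time since the last VMware Tools heartbeat
    seconds_since_io: float         # time since the last disk or network activity


def should_reset(status: VmStatus,
                 failure_interval: float = 30.0,      # default quoted in the excerpt
                 io_stats_interval: float = 120.0) -> bool:  # assumed value
    """Return True only if BOTH the heartbeat and I/O checks indicate a hang."""
    heartbeat_lost = status.seconds_since_heartbeat > failure_interval
    io_idle = status.seconds_since_io > io_stats_interval
    # A missing heartbeat alone (e.g. during a VMware Tools upgrade) is not
    # enough; the VM must also show no disk or network activity.
    return heartbeat_lost and io_idle


# Guest OS hung: no heartbeat and no I/O for a long time -> reset.
print(should_reset(VmStatus(seconds_since_heartbeat=45, seconds_since_io=300)))
# Tools upgrade: heartbeat paused but the VM is still doing I/O -> no reset.
print(should_reset(VmStatus(seconds_since_heartbeat=45, seconds_since_io=10)))
```

The point of the second check is that a busy but healthy VM keeps generating I/O, so a temporary heartbeat gap by itself does not trigger a reset.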

If you found my answers useful, please consider marking them as Correct or Helpful. Regards, Hussain https://virtualcubes.wordpress.com/
Paltelkalpesh
Enthusiast

When you enable VM Monitoring, the monitoring service evaluates whether each VM in the cluster is running by checking for regular heartbeats from the VMware Tools process running inside the guest and for I/O activity from the VM. If no heartbeat or I/O activity is received, this is most likely because the guest operating system has failed or VMware Tools is not being allocated any CPU time to complete its tasks. In such a case, the VM Monitoring service determines that the VM has failed, and the VM is rebooted to restore service.
