VMware Cloud Community
stogner
Contributor

Failover time?

Hello,

I have not been able to find this anywhere, but does anyone know how long it takes for VMware to see that a host is dead when it is configured for HA?

3 Replies
admin
Immortal

In my experience, failover usually occurs in 60 to 90 seconds.

esiebert7625
Immortal

Host failure detection occurs 15 seconds after the HA service on a host has stopped sending heartbeats to the other hosts in the cluster. A host stops sending heartbeats if it is isolated from the network. At that time, other hosts in the cluster treat this host as failed, while this host declares itself as isolated from the network.

By default, the isolated host powers off its virtual machines. These virtual machines can then successfully fail over to other hosts in the cluster. If the isolated host still has SAN access and a virtual machine is still running on it, the host retains the disk lock on the virtual machine files, so any attempt to fail over that virtual machine to another host fails and it continues to run on the isolated host. VMFS disk locking prevents simultaneous write operations to the virtual machine disk files and the corruption that could result.

If the network connection is restored before 12 seconds have elapsed, other hosts in the cluster will not treat this as a host failure. In addition, the host with the transient network connection problem does not declare itself isolated from the network and continues running.

In the window between 12 and 14 seconds, the clustering service on the isolated host declares itself as isolated and starts powering off virtual machines with default isolation response settings. If the network connection is restored during that time, the virtual machines that were powered off are not restarted on other hosts, because the HA services on those hosts do not yet consider this host failed. As a result, if the network connection is restored in this window, 12 to 14 seconds after the host lost connectivity, the virtual machines are powered off but not failed over.
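To make the timeline above easier to follow, here is a rough Python sketch that maps the length of a network outage to the outcome described. It is only a model of the default timings quoted in this post, not anything VMware ships. The names `classify_outage` and `Outcome` are made up for illustration. It assumes the default isolation response (power off) and that nothing else blocks a restart on another host. The text only spells out the 12 to 14 second window and the 15 second detection point, so treating everything from 12 up to 15 seconds as the "powered off but not failed over" case is my assumption.

```python
from enum import Enum

# Default thresholds quoted in the post, in seconds since the last heartbeat.
ISOLATION_DECLARED_AT = 12  # isolated host declares itself isolated
HOST_FAILED_AT = 15         # other hosts treat the silent host as failed


class Outcome(Enum):
    NO_ACTION = "connection restored in time; host keeps running its VMs"
    POWERED_OFF_NOT_FAILED_OVER = "VMs powered off, not restarted elsewhere"
    FAILED_OVER = "host treated as failed; VMs restarted on other hosts"


def classify_outage(seconds_without_heartbeat: float) -> Outcome:
    """Return the outcome for an outage of the given length.

    Hypothetical helper: it only encodes the default timeline from the post.
    Assumption: the gap between 14 and 15 seconds is folded into the
    'powered off but not failed over' case.
    """
    if seconds_without_heartbeat < 0:
        raise ValueError("outage length cannot be negative")
    if seconds_without_heartbeat < ISOLATION_DECLARED_AT:
        return Outcome.NO_ACTION
    if seconds_without_heartbeat < HOST_FAILED_AT:
        return Outcome.POWERED_OFF_NOT_FAILED_OVER
    return Outcome.FAILED_OVER


if __name__ == "__main__":
    for seconds in (5, 13, 20):
        print(seconds, "->", classify_outage(seconds).value)
```

The 12 to 14 second window is the awkward case: the network blip is over, but the virtual machines stay powered off until someone starts them again.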

http://download3.vmware.com/vmworld/2006/tac9413.pdf

http://kb.vmware.com/KanisaPlatform/Publishing/894/2956923_f.SAL_Public.html

http://www.vmware.com/pdf/vmware_ha_wp.pdf

If you find this post helpful, please award points...thanks

stogner
Contributor

Thank you.
