VMware Cloud Community
nluckner
Contributor

Automatic VM failover from one ESX host to another ESX host

Hello,

About a week ago I deployed a VMware environment. This environment consists of:

ESX-HOST1

ESX-HOST2

We have vSphere running as a VM on ESX-HOST2.

I have 4 virtual machines running on ESX-HOST1 and 4 virtual machines running on ESX-HOST2. I thought I had configured my environment so that, if either ESX host failed for any reason, the VMs on the failed host would automatically migrate to the host that was still online. Currently, I can manually migrate a VM from one host to another.

The reason I ask is that yesterday I had an IP conflict on one of my VMs on ESX-HOST1. Because of this IP conflict, the switch port on my Nortel switch shut down and the entire ESX-HOST1 node was no longer available. I was under the assumption that all of my VMs would have migrated automatically to ESX-HOST2 with minimal downtime.

What am I missing here?


Thanks

Nick

5 Replies
jjkrueger
VMware Employee

What you ran into was a failure scenario in vSphere HA called Host Isolation.

Host Isolation occurs when an ESXi host can no longer participate in heartbeat or election traffic, and cannot communicate with the default gateway. In vSphere 5, the default behavior for this is to leave the VMs running on the isolated host, as in many cases, the management network that we use for heartbeats is completely separate from the production network(s). The Isolation response can be changed in the HA cluster settings to Power Off the VMs (not graceful) or Shut down the VMs (graceful).

vSphere HA primarily protects against complete host failures, not network failures.
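To make the distinction concrete, here is a minimal sketch in Python of the isolation condition described above. It is only a toy model for illustration, not VMware code or any vSphere API; the names `HostSignals` and `classify` are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class HostSignals:
    """Hypothetical snapshot of one host's state (illustration only)."""
    powered_on: bool         # the host itself is still running
    heartbeats_ok: bool      # it can take part in HA heartbeat/election traffic
    gateway_reachable: bool  # it can reach the default gateway


def classify(signals: HostSignals) -> str:
    """Toy classification following the description in this thread."""
    if not signals.powered_on:
        # The host is really gone: this is a host failure.
        return "host failure"
    if not signals.heartbeats_ok and not signals.gateway_reachable:
        # The host is running but cut off from the management network.
        return "host isolation"
    return "not isolated"


# The scenario from the original post: the switch port shut down,
# but ESX-HOST1 itself kept running.
print(classify(HostSignals(powered_on=True, heartbeats_ok=False, gateway_reachable=False)))
```

In the original scenario the host kept running behind a dead switch port, so it lands in the isolation branch rather than the failure branch.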

nluckner
Contributor

And just to add, all networking is done over a single Ethernet connection (management network, vSphere, client access). I was under the assumption that if we physically lost an ESX host unexpectedly, the online ESX host would take over the hosting.

nluckner
Contributor

If ESX-1 failed, lost network connectivity, etc., is there a way to configure ESX-2 to take over all the hosting roles? I was under the impression that with two hosts, if we lost one due to hardware failure, network failure, etc., the other would take over all hosting.

jjkrueger
VMware Employee

Loss of a network connection is not loss of a host. Those are distinctly different events from the perspective of vSphere HA. Loss of the management network results in Host Isolation. Loss of a host results in HA restarting the affected VMs on a surviving cluster node.

If this is a production environment, I would first say that network redundancy is a necessity. That is how you protect against a network port or cable failure.

If you set the Isolation Response to Shut Down or Power Off the VMs, HA should restart the VMs on the surviving cluster members.
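Put another way, whether the VMs come back up on the other host depends on both the event and the Isolation Response setting. The following Python sketch is a simplified model of that rule as described in this thread, not actual HA logic or a vSphere API; `IsolationResponse` and `vms_restart_elsewhere` are hypothetical names.

```python
from enum import Enum


class IsolationResponse(Enum):
    """The three isolation responses mentioned in this thread."""
    LEAVE_POWERED_ON = "leave powered on"
    POWER_OFF = "power off"    # not graceful
    SHUT_DOWN = "shut down"    # graceful


def vms_restart_elsewhere(event: str, response: IsolationResponse) -> bool:
    """Toy rule: does HA restart the affected VMs on a surviving host?"""
    if event == "host failure":
        # A lost host always leads to a restart on a surviving cluster node.
        return True
    if event == "host isolation":
        # VMs left running on the isolated host are not restarted elsewhere.
        return response is not IsolationResponse.LEAVE_POWERED_ON
    raise ValueError(f"unknown event: {event!r}")


# The original incident: isolation with VMs left running, so nothing moved.
print(vms_restart_elsewhere("host isolation", IsolationResponse.LEAVE_POWERED_ON))
# After changing the response to Shut Down:
print(vms_restart_elsewhere("host isolation", IsolationResponse.SHUT_DOWN))
```

With only a single network connection per host, this setting is the only thing that changes the outcome of an isolation event, which is why redundant networking is still the better protection.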

nluckner
Contributor

Thanks very much for your help! I am going to test this out after changing the Isolation Response to Shut Down.
