VMware Cloud Community
Brijn
Enthusiast

VMs did not fail over after host failure

Hello,

We have two hosts in our cluster. A few days ago one host became unresponsive, and so did the VMs running on it. When I started looking at the problem, vCenter was showing them as "disconnected".

When the host became unresponsive (and showed as such in vCenter), shouldn't the VMs have been started up on the other host?

I don't have VM monitoring enabled. Is that needed to get VMs to fail over when the underlying host dies?

Tips welcome!

Bas

4 Replies
MarceloNY
Contributor

Hi Bas,

Your supposition is correct, but you have to make sure that HA and DRS are enabled on this cluster.

Have you checked that?

Also, you have to make sure that the second ESX host has enough resources available to support the VMs from the failed host.

The other details concern the HA configuration. How is HA configured? What about the Virtual Machine Options?

Marcelo.

Marcelo Silva VCP4/CCA/ITIL3/MCITP/MCTS/MCSA
Brijn
Enthusiast

Hi,

Both HA and DRS are enabled, I can vMotion VMs between the hosts without issues, and enough resources are available on the second host. Even without VM monitoring enabled, I was expecting the VMs to come online on the second host shortly after the first host stopped responding.

Bas

Troy_Clavell
Immortal

Just because your host(s) show as not responding/disconnected in vCenter does not mean there was an HA event. HA runs on the ESX hosts and relies on the heartbeat of the service console. You can check the Tasks & Events tab of your cluster to get an idea of whether or not there was an HA event, but you would be better off checking on the ESX host itself:

HA agent logs: /var/log/vmware/aam

Configuration files: /etc/opt/vmware/aam
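If you copy the HA agent logs off the host, a small script can help narrow down where to look. This is only a rough sketch, assuming the log directory has been copied to a workstation with Python 3; the function name and the search keywords are hypothetical, and the actual wording of the log messages depends on your ESX version.

```python
import os
from typing import Iterable, List, Tuple


def find_log_lines(log_dir: str, keywords: Iterable[str]) -> List[Tuple[str, int, str]]:
    """Return (path, line number, text) for every line containing any keyword.

    Matching is case-insensitive. The keywords are supplied by the caller;
    nothing here assumes a particular log format.
    """
    wanted = [k.lower() for k in keywords]
    hits = []
    for root, _dirs, files in os.walk(log_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            # errors="replace" keeps the scan going if a file has odd bytes
            with open(path, "r", errors="replace") as fh:
                for lineno, line in enumerate(fh, start=1):
                    lowered = line.lower()
                    if any(k in lowered for k in wanted):
                        hits.append((path, lineno, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # "aam_logs_copy" is a placeholder for a local copy of /var/log/vmware/aam;
    # the keywords are guesses, adjust them to what your logs contain.
    for path, lineno, text in find_log_lines("aam_logs_copy", ["fail", "isolat"]):
        print(f"{path}:{lineno}: {text}")
```

Comparing the timestamps of the hits with the time the host dropped out in vCenter should show whether HA reacted at all.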

Also, here's a great blog, if you haven't seen it

Finally, if you do not have enough resources for failover, HA will not restart your guests by default. Here's a good way of calculating failover capacity.
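To make the resource point concrete, here is a back-of-the-envelope check. It is a simplified sketch, not how HA admission control actually calculates capacity: it only compares the total demand of the VMs on the failed host with the spare CPU and memory on the surviving hosts. All names and numbers are made up for illustration.

```python
from typing import Dict, List, NamedTuple


class Host(NamedTuple):
    cpu_mhz: int
    mem_mb: int


class VM(NamedTuple):
    host: str      # name of the host the VM currently runs on
    cpu_mhz: int   # reserved or expected CPU demand
    mem_mb: int    # reserved or expected memory demand


def can_absorb_failure(hosts: Dict[str, Host], vms: List[VM], failed: str) -> bool:
    """True if the surviving hosts have enough spare CPU and memory in total.

    This is an aggregate check only; with more than two hosts it ignores
    whether each individual VM fits on a single surviving host.
    """
    free_cpu = sum(h.cpu_mhz for name, h in hosts.items() if name != failed)
    free_mem = sum(h.mem_mb for name, h in hosts.items() if name != failed)
    for vm in vms:
        if vm.host != failed:
            free_cpu -= vm.cpu_mhz
            free_mem -= vm.mem_mb
    need_cpu = sum(vm.cpu_mhz for vm in vms if vm.host == failed)
    need_mem = sum(vm.mem_mb for vm in vms if vm.host == failed)
    return need_cpu <= free_cpu and need_mem <= free_mem


# Example two-host cluster (hypothetical figures)
hosts = {"esx1": Host(10000, 32768), "esx2": Host(10000, 32768)}
vms = [
    VM("esx1", 3000, 8192),
    VM("esx1", 2000, 8192),
    VM("esx2", 4000, 12288),
]
print(can_absorb_failure(hosts, vms, failed="esx1"))
```

In a two-host cluster like the one in this thread, the check boils down to whether one host can carry every VM on its own.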

Brijn
Enthusiast

Hi Troy,

Good reads! The info will help me check what is going on next time it happens!

Bas
