VMware Cloud Community
sean_carolan
Contributor

What is the safe and recommended way to simulate an HA cluster host failure?

I'd like to simulate one of our ESX hosts going offline to see how the cluster handles the load. What is the VMware-recommended way to do this? Pull the network cables out?

Basically I'm looking to simulate something like a complete loss of power to one of the hosts.

1 Solution

Accepted Solutions
java_cat33
Virtuoso

Disconnect the service console network on your host - this is the nicest way to treat your hardware.

9 Replies
Troy_Clavell
Immortal

Pulling the power plug will definitely do it. However, I would just do a graceful shutdown of ESX.

sean_carolan
Contributor

Troy:

I know that I can do a graceful shutdown, but this is not a very good simulation of a power loss. A graceful shutdown also gives the HA cluster a chance to do a graceful handoff of all the guest VMs to another host.

I want to know exactly what to expect if one of the ESX hosts gives up the ghost abruptly during the middle of the day. We of course will do our testing during a maintenance window so as not to disrupt the users of these machines.

I'm thinking that unplugging all the network cables is probably the safest way to do this - does anyone else have suggestions or input on this topic?

thanks

Sean

Troy_Clavell
Immortal

Yes, unplugging the network cables will work as well with the default settings. If you have any advanced HA options set for isolation, make sure you disable them first.

azn2kew
Champion

Since you have the luxury of testing HA failure, I would try several different scenarios. For blade servers, we just removed the blade from the enclosure or unplugged the power cable, which truly tests a total outage. Unplugging the network cable would trigger HA too, but it is not a true power-outage test.

If you found this information useful, please consider awarding points for "Correct" or "Helpful". Thanks!!!

Regards,

Stefan Nguyen

VMware vExpert 2009

iGeek Systems Inc.

VMware, Citrix, Microsoft Consultant

sean_carolan
Contributor

So, I could unplug both power supplies, which would obviously simulate a real failure. Are there risks of corrupting the virtual disk files of the VMs running on that host?

Troy_Clavell
Immortal

There is always risk when your VMs go down hard. Here's what I would do: migrate all VMs off the ESX host you want to test, disable DRS, create a couple of test VMs on the host, pull the power, and wait for HA to kick in.

This way you won't affect any production VMs.
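
Before pulling the power, it can help to estimate whether the surviving hosts have room for the VMs that HA will try to restart. The following is a rough sizing sketch only: the host names, VM names, and memory figures are made up for illustration, and the first-fit placement is a stand-in, not VMware's actual HA placement or admission-control logic.

```python
def place_after_failure(host_capacity_gb, vm_placement, vm_memory_gb, failed_host):
    """Estimate where a failed host's VMs could restart, by memory only.

    Returns (restarted, stranded): a dict of vm -> new host, and a list of
    VMs that fit nowhere. Simplified first-fit model, NOT VMware's algorithm.
    """
    # Free memory on each surviving host after accounting for its running VMs.
    free = {h: cap for h, cap in host_capacity_gb.items() if h != failed_host}
    for vm, host in vm_placement.items():
        if host != failed_host:
            free[host] -= vm_memory_gb[vm]

    # VMs that were on the failed host, largest first.
    victims = sorted(
        (vm for vm, host in vm_placement.items() if host == failed_host),
        key=lambda vm: vm_memory_gb[vm],
        reverse=True,
    )

    restarted, stranded = {}, []
    for vm in victims:
        # Pick the first surviving host (alphabetically) with enough free memory.
        target = next((h for h in sorted(free) if free[h] >= vm_memory_gb[vm]), None)
        if target is None:
            stranded.append(vm)
        else:
            free[target] -= vm_memory_gb[vm]
            restarted[vm] = target
    return restarted, stranded


# Hypothetical three-host cluster with two test VMs on the host under test.
hosts = {"esx01": 32, "esx02": 32, "esx03": 32}
placement = {"test-vm1": "esx01", "test-vm2": "esx01", "web1": "esx02", "db1": "esx03"}
memory = {"test-vm1": 4, "test-vm2": 8, "web1": 16, "db1": 28}

restarted, stranded = place_after_failure(hosts, placement, memory, "esx01")
print("restarted:", restarted)
print("stranded:", stranded)
```

If the stranded list is not empty for the host you plan to fail, the test will most likely show VMs that HA cannot restart, which is worth knowing before the maintenance window.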

java_cat33
Virtuoso

Disconnect the service console network on your host - this is the nicest way to treat your hardware.

sean_carolan
Contributor

Thanks, java_cat33, this is the answer I was looking for.

Troy_Clavell
Immortal

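Hopefully you only have a single COS (service console) and your isolation response is not set to "leave VMs powered on". With a second service console the host will not be seen as isolated, and with that isolation response the VMs will simply stay running on the isolated host.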

Good Luck!
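
The caveats in this thread can be summed up as a small decision table. The sketch below is a simplified model of the behavior described by the posters, not a call into any VMware API; the function name, parameter names, and string values are hypothetical and chosen only for illustration.

```python
def expected_ha_outcome(test_method, isolation_response, redundant_console=False):
    """Predict, roughly, what an HA test will show.

    test_method:        "pull power" or "disconnect console"
    isolation_response: "power off", "shut down", or "leave powered on"
    redundant_console:  True if the host has a second service console network
    Simplified model of the thread's advice; real clusters may differ.
    """
    if test_method == "pull power":
        # The host is really gone, so the isolation settings do not matter.
        return "host failed: HA restarts its VMs on the surviving hosts"

    if test_method == "disconnect console":
        if redundant_console:
            return "no isolation: heartbeats continue on the second console, nothing happens"
        if isolation_response == "leave powered on":
            return "host isolated: VMs keep running on the isolated host, no restart elsewhere"
        return "host isolated: VMs are stopped there and HA restarts them on the surviving hosts"

    raise ValueError(f"unknown test method: {test_method!r}")


print(expected_ha_outcome("disconnect console", "leave powered on"))
print(expected_ha_outcome("disconnect console", "power off"))
print(expected_ha_outcome("pull power", "leave powered on"))
```

In other words, disconnecting the service console only behaves like a host failure when there is a single console network and the isolation response stops the VMs.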
