I'd like to simulate one of our ESX hosts going offline to see how the cluster handles the load. What is the VMware-recommended way to do this? Pull the network cables out?
Basically I'm looking to simulate something like a complete loss of power to one of the hosts.
Disconnect the service console network on your host - this is the nicest way to treat your hardware.
Pulling the power plug will definitely do it. However, I would just do a graceful shutdown of ESX.
Troy:
I know that I can do a graceful shutdown, but this is not a very good simulation of a power loss. A graceful shutdown also gives the HA cluster a chance to do a graceful handoff of all the guest VMs to another host.
I want to know exactly what to expect if one of the ESX hosts gives up the ghost abruptly during the middle of the day. We of course will do our testing during a maintenance window so as not to disrupt the users of these machines.
I'm thinking that unplugging all the network cables is probably the safest way to do this - does anyone else have suggestions or input on this topic?
thanks
Sean
Yes, unplugging the network cables will work as well, by default. If you have any advanced HA options set for isolation, make sure you disable them as well.
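To make the difference between the two tests concrete, here is a toy model in Python. It is not VMware code: the function name and the response labels are made up for illustration. It assumes the usual behavior, and it assumes shared storage stays reachable from the isolated host (for example over Fibre Channel), so VMs left running there keep their disk locks.

```python
# Toy model only; names and labels are hypothetical, not a VMware API.
# Assumption: shared storage stays reachable from the isolated host,
# so VMs left running there keep their disk locks.

def expected_outcome(failure, isolation_response="power_off"):
    """Describe what should happen to the VMs on the failed host."""
    if failure == "power_loss":
        # The host is dead, nothing holds the disks, HA restarts the VMs.
        return "restarted on surviving hosts"
    if failure == "network_isolation":
        if isolation_response == "power_off":
            return "powered off by isolated host, then restarted on surviving hosts"
        if isolation_response == "leave_powered_on":
            return "keeps running on isolated host, not restarted elsewhere"
    raise ValueError("unknown failure or isolation response")


if __name__ == "__main__":
    for failure, response in [
        ("power_loss", "power_off"),
        ("network_isolation", "power_off"),
        ("network_isolation", "leave_powered_on"),
    ]:
        print(failure, "/", response, "->", expected_outcome(failure, response))
```

In other words, a cable pull only looks like a power loss to the rest of the cluster if the isolation response powers the VMs off.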
Since you have the luxury of testing HA failure, I would try different scenarios. For our blade servers, we just removed the blade from the enclosure or unplugged the power cable, which truly tests a total outage. Unplugging the network cables would work too, but it is not a true power outage test.
Regards,
Stefan Nguyen
VMware vExpert 2009
iGeek Systems Inc.
VMware, Citrix, Microsoft Consultant
So, I could unplug both power supplies; this would obviously simulate a real failure. Are there risks of corrupting the virtual disk files that are on the machine?
There is always risk when your VMs go down hard. Here's what I would do: migrate all VMs off the ESX host you want to test, disable DRS, create a couple of test VMs on the host, pull the power, and wait for HA to kick in.
This way you won't affect any production VMs.
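As a rough sketch of the pre-flight check implied by those steps, something like the Python below could confirm that only test VMs are left on the host before you pull the power. Everything in it is hypothetical: the inventory is a hand-typed list rather than anything pulled from vCenter, and the "hatest-" prefix is just an assumed naming convention for the throwaway VMs.

```python
# Hypothetical inventory check; nothing here talks to vCenter.
TEST_PREFIX = "hatest-"  # assumed naming convention for throwaway test VMs


def vms_blocking_test(inventory, target_host, test_prefix=TEST_PREFIX):
    """Return the VMs still on target_host that are not test VMs."""
    return sorted(
        vm for vm, host in inventory
        if host == target_host and not vm.startswith(test_prefix)
    )


def ready_for_power_pull(inventory, target_host, drs_enabled,
                         test_prefix=TEST_PREFIX):
    """True only if DRS is off, a test VM is on the host, and no other VMs remain."""
    on_host = [vm for vm, host in inventory if host == target_host]
    has_test_vm = any(vm.startswith(test_prefix) for vm in on_host)
    blockers = vms_blocking_test(inventory, target_host, test_prefix)
    return (not drs_enabled) and has_test_vm and not blockers


if __name__ == "__main__":
    # (vm_name, host_name) pairs typed in by hand for the example.
    inventory = [
        ("hatest-01", "esx01"),
        ("hatest-02", "esx01"),
        ("mail01", "esx02"),
        ("sql01", "esx01"),  # still needs to be migrated off
    ]
    print(vms_blocking_test(inventory, "esx01"))
    print(ready_for_power_pull(inventory, "esx01", drs_enabled=False))
```

With the sample list, sql01 is still on esx01, so the check says the host is not ready yet.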
Disconnect the service console network on your host - this is the nicest way to treat your hardware.
Thanks, java_cat33, this is the answer I was looking for.
Hopefully you only have a single COS and your isolation response is not set to "leave VMs powered on".
Good Luck!