Hi,
We have 3 hosts in an HA cluster. Heartbeating runs between them over the network as well as through HA datastore heartbeating using 2 datastores.
We have a physical network switch in the rack that needs to be turned off briefly for an adapter change. It will be off for about 5 minutes.
Is there a chance this would cause any issues with the VMs? I understand that while the switch is powered off, internet access will not be available and the services will be out of reach.
But will everything resume automatically once it's powered back on? Is there any chance that the hosts will trigger HA and try to restart the VMs? How long will HA wait for a heartbeat?
Disable Host Monitoring before starting the maintenance.
Thanks a lot, Amir. I will do that.
In addition to HA, is there anything else I need to be cautious of that could cause issues with the hosts or VMs during the power cycle of the switch?
Basically, the activity is simple. This particular switch doesn't have dual power. I just want to turn it off, connect a power cable from the dual power supply, and turn it back on.
This setup is in production. Is there anything else I need to do so that everything in production continues as is once the switch is back up and running?
Regards
Disable vSphere HA as mentioned above. Also, I don't know what kind of storage you have, but if it is network-attached and goes across the same switch, then 5 minutes of downtime would trigger an "all paths down" (APD) condition, which would impact all VMs. That is also something to take into consideration!
Thanks depping.
What we have is Fibre Channel and iSCSI connections directly from the datastores to the hosts. The switch is not involved in those.
However, there is a NAS mounted as a datastore that is connected via the switch. So will I have to unmount this datastore before I turn off the switch?
Also, I have never encountered an "All Paths Down" so far. Will it recover automatically once the switch is back up and running? Or will I have to go to the configuration of each host and remount the storage devices?
It depends on how you configured HA. If HA is configured to respond to APD or PDL, it will kill those VMs and restart them if possible. An APD is declared when there has been no storage connection for more than 140 seconds. At that point I/O retries will stop, which could cause your operating system to fail. If the OS doesn't fail, then the VM and host will recover when the storage returns.
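To put numbers on that, here is a rough sketch (not a VMware tool) that compares an outage length against the 140-second APD timeout mentioned above. The function name and the 120-second comparison case are made up for illustration, and the timeout value is simply the one quoted in this thread, so confirm what is actually configured on your hosts.

```python
# Hypothetical helper: checks an outage length against the APD timeout.
# 140 seconds is the figure quoted in this thread, not a value read from a host.
APD_TIMEOUT_SECONDS = 140


def apd_would_be_declared(outage_seconds, apd_timeout_seconds=APD_TIMEOUT_SECONDS):
    """Return True if storage is unreachable for longer than the APD timeout."""
    return outage_seconds > apd_timeout_seconds


if __name__ == "__main__":
    # The planned switch outage is about 5 minutes; 120 s is an illustrative short case.
    for label, seconds in [("planned 5 minute outage", 5 * 60), ("2 minute outage", 120)]:
        verdict = "exceeds" if apd_would_be_declared(seconds) else "stays within"
        print(f"{label}: {seconds} s {verdict} the {APD_TIMEOUT_SECONDS} s APD timeout")
```

So for any datastore that sits behind the switch, a 5-minute outage is well past the point where I/O retries stop.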
OK. I am planning to increase the APD timeout to 20 minutes just to be sure, so that it keeps retrying I/O until the switch is back up and running. This should be fine, right?
No, please don't do this, as it will result in the hypervisor failing. Please leave the APD setting as is. Simply set the vSphere HA APD response to Disabled!
OK, well noted.
I guess the instructions at this URL would be OK to follow for this?
//docs.vmware.com/en/VMware-vSphere/6.7/com.vmware.vsphere.avail.doc/GUID-DF3123C3-CE63-4431-A1FE-FD41F3395BCE.html
Thanks, guys. I was able to complete the activity successfully. I disabled the APD response from the Web Client, as the vSphere Client wasn't showing it. I also disabled Host Monitoring. The restart of the switch took around 3 minutes. Connectivity was down until then, but when the switch came back up, everything went back to normal. Nothing on the hosts or VMs was affected.
Thanks a lot for your great support.
Good to hear!