VMware Cloud Community
ChicaneUK
Enthusiast

Graceful failover to mirror array without SRM?

Hello all.

Today we were doing some testing on a new cluster we're putting in. The storage is a 3Par T800, which is mirrored to a 3Par F400. We wanted to simulate real disaster conditions: an array failing, leaving us to mount our mirror volumes. So for step one, we simply unpresented the test LUNs from the 3Par and kicked off a rescan on the hosts.

Now I'm well aware of the known bug in how vSphere handles APD (all paths down) scenarios, and obviously if we were doing this in a controlled way, we would evacuate the LUNs, remove VMFS and mask them... but of course this was a disaster scenario test, where we wouldn't necessarily get this luxury.

Three of the four hosts in the cluster spent 15-20 minutes scanning before they finally gave up and cleared the dead paths, but one host was still going after an hour and became inaccessible. It was endlessly logging APD messages to the vmkernel log. This was the host on which we still had a VM registered on one of the LUNs we had removed.

So - my question is, should I set the FailVolumeOpenIfAPD option to "1" on all my hosts to prevent this problem occurring again? Would this be best practice in an environment where we have a mirror array and would want to fail over to the mirror volumes in the event of a disaster? Or would we typically enable this option only during a disaster and leave it disabled at all other times?
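
For reference, this is roughly how I understand the option would be checked and changed from each host's console. It's only a sketch: the /VMFS3/FailVolumeOpenIfAPD option path is my assumption about where the setting lives, so please correct me if it differs on your build.

```shell
# Show the current value of the option (assumed path: /VMFS3/FailVolumeOpenIfAPD)
esxcfg-advcfg -g /VMFS3/FailVolumeOpenIfAPD

# Enable it: set the value to 1
esxcfg-advcfg -s 1 /VMFS3/FailVolumeOpenIfAPD

# Put it back to 0 once the dead LUNs have been cleaned up
esxcfg-advcfg -s 0 /VMFS3/FailVolumeOpenIfAPD
```

Whether the value should stay at 1 permanently or only be set during a failover is exactly what I'm unsure about.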

We've got a ticket open with VMware support to see what they advise, but I'd be interested to hear any experiences or best practices you may have on this. I've done a load of reading around and can't find any advice on which environments should have the FailVolumeOpenIfAPD option enabled.
