VMware Cloud Community
therealhostman
Contributor

Stretched cluster failure scenario

Hi,

I'm after some clarification regarding stretched clusters and a specific failure scenario. If the 10Gb inter-site link used for synchronous replication goes dark for a period of time, what is the outcome? Some documentation I've read says VMs will move from the secondary to the preferred site (if running active/active, for example). Is it not possible for VMs to continue running active/active, albeit in a state where synchronous replication will not resume until the 10Gb link is back online?

Thanks.

4 Replies
TheBobkin
Champion

In a standard stretched cluster configuration (RAID-1, FTT=1 across sites), if the inter-site connection is broken, VMs will fail over to whichever site is configured as Preferred (this can of course be changed should that site be down or impaired).

VMs won't run on both sides simultaneously - that wouldn't make sense, as the cluster would then be split-brained, and which set of data would you use following the outage?

Once the connection between the sites is re-established, the delta data from the Preferred site is synced to the other site.
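The delta resync Bob describes can be sketched as a toy model: while the link is down, the surviving site records which blocks changed, and on reconnect only those blocks are shipped across rather than a full copy. This is a simplified illustration under those assumptions, not vSAN's actual resync implementation.

```python
# Toy model of delta resync after an inter-site link failure.
# Simplified illustration only - not vSAN's actual resync logic.

class Replica:
    def __init__(self, size):
        self.blocks = [0] * size

class StretchedObject:
    def __init__(self, size):
        self.preferred = Replica(size)
        self.secondary = Replica(size)
        self.link_up = True
        self.dirty = set()  # block addresses changed during the outage

    def write(self, addr, value):
        self.preferred.blocks[addr] = value
        if self.link_up:
            self.secondary.blocks[addr] = value  # synchronous replication
        else:
            self.dirty.add(addr)                 # remember the delta

    def reconnect(self):
        # Ship only the delta, not a full copy of the object.
        for addr in self.dirty:
            self.secondary.blocks[addr] = self.preferred.blocks[addr]
        self.dirty.clear()
        self.link_up = True

obj = StretchedObject(size=8)
obj.write(0, 1)              # replicated synchronously while the link is up
obj.link_up = False          # inter-site link goes dark
obj.write(3, 7)              # only the preferred copy advances
assert obj.secondary.blocks[3] == 0
obj.reconnect()              # delta resync brings the secondary up to date
assert obj.secondary.blocks == obj.preferred.blocks
```

The key point is that the outage cost on reconnect is proportional to the data changed during the outage, not to the total object size.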

More information regarding failure scenarios and required HA settings etc. can be found here:

https://storagehub.vmware.com/t/vmware-vsan/vsan-stretched-cluster-guide/

Bob

depping
Leadership

And we (the VMware vSAN product team) know this is a problem, and we are looking to fix it in the future. The reason you end up in this situation today is that the Witness VM binds itself to one location, which means the other location loses quorum; as a result, all stretched VMs there lose access to their storage objects, and those VMs are killed by vSAN automatically.

Again, this is a known concern, and the team has it listed as an issue we need to solve in the future. Unfortunately I can't comment on when that will be.

therealhostman
Contributor

Provided communication with the witness is still live in a scenario where the replication link has failed, I would have thought VMs could remain running 50/50, with the ability to fail over across sites of course disabled until replication is re-established and the data resynced. I understand the concept of a split brain, but this is essentially what the witness server is supposed to prevent.

From what depping is saying, this is how the vSAN development team wants it to work, but it needs development work to facilitate it?

depping
Leadership

The problem is that the Witness Appliance is not a true witness; it hosts witness objects. A host can only be part of one cluster (or, in this case, partition), so the witness host will bind itself to the preferred location, which causes the secondary location to lose quorum.
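The quorum behaviour described above can be modelled as simple majority voting: each component (a data replica at each site, plus the witness component) carries a vote, and a partition can serve the object only if it holds a strict majority. The names and vote counts below are illustrative assumptions, not taken from vSAN internals.

```python
# Toy quorum model for a stretched cluster partition.
# Component names and vote counts are illustrative, not vSAN internals.

VOTES = {"preferred_replica": 1, "secondary_replica": 1, "witness": 1}
TOTAL = sum(VOTES.values())

def has_quorum(partition):
    """True if the components in this partition hold a strict majority of votes."""
    return sum(VOTES[c] for c in partition) * 2 > TOTAL

# Inter-site link fails; the witness binds to the preferred site's partition.
preferred_partition = {"preferred_replica", "witness"}
secondary_partition = {"secondary_replica"}

assert has_quorum(preferred_partition)      # 2 of 3 votes: objects stay accessible
assert not has_quorum(secondary_partition)  # 1 of 3 votes: access lost, VMs killed
```

Because the witness can only join one partition, at most one side can ever hold the majority, which is exactly what prevents split-brain - and also why the secondary site cannot keep running independently today.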

Yes, this needs development work, and it is being looked at.