While I was configuring the vCenter for HA in a stretched cluster I was thinking about DRS rules.
In my opinion you should use FT for the witness node of vCenter.
Why ?
Well, you need the witness node to make a failover between the active and passive node possible.
Say you place the active node in MER 1 and the passive node in MER 2 (the cluster is stretched over MER 1 and 2).
Where should you place the witness ?
Witness in MER 1
- outage of MER 2 ==> vCenter will still be accessible.
- outage of MER 1 ==> no failover of vCenter possible, active node + witness are down.
Witness in MER 2
- outage of MER 2 ==> no failover possible, passive node + witness are down.
- outage of MER 1 ==> failover to passive node possible.
Either way, there is always one site outage that leaves vCenter without a working failover.
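The placement reasoning above can be checked with a small sketch. It is only a model, not how vCenter HA is implemented: it assumes a simple majority rule (the service survives if at least two of the three nodes survive a site outage), and the function names and the "third site" label are made up for illustration.

```python
from itertools import product

# Placement from the post: active in MER 1, passive in MER 2.
ACTIVE_SITE = "MER 1"
PASSIVE_SITE = "MER 2"


def vcenter_available(witness_site, failed_site):
    """Return True if vCenter stays (or becomes) available after a site outage.

    Assumption: a simple 2-of-3 majority rule between the active, passive
    and witness nodes. This is a simplification for illustration only.
    """
    placement = {
        "active": ACTIVE_SITE,
        "passive": PASSIVE_SITE,
        "witness": witness_site,
    }
    survivors = [node for node, site in placement.items() if site != failed_site]
    return len(survivors) >= 2


def outage_table(witness_sites, failed_sites):
    """Evaluate every combination of witness placement and failed site."""
    return {
        (witness, failed): vcenter_available(witness, failed)
        for witness, failed in product(witness_sites, failed_sites)
    }


if __name__ == "__main__":
    table = outage_table(["MER 1", "MER 2", "third site"], ["MER 1", "MER 2"])
    for (witness, failed), ok in table.items():
        state = "available" if ok else "NOT available"
        print(f"witness in {witness}, outage of {failed}: vCenter {state}")
```

With the witness in MER 1 or MER 2, one of the two outages always takes down two nodes at once. Only a witness outside both sites survives either outage in this model.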
So why not protect the witness node with FT and place the secondary VM in the other MER ?
The only config issue is that the witness node has hot plug enabled for CPU and memory, which FT does not allow.
After disabling hot plug, the node will run with FT.
So what do you think of this setup ?
- Active node in MER 1
- Passive node in MER 2
- Witness node in MER 1, protected by FT, with the secondary VM placed in MER 2
Would that be a valid setup for vCenter HA in a stretched cluster ?
Since this setup or possibility is not mentioned anywhere in the vCenter High Availability documentation, it is unlikely to be supported by VMware (for whatever reason).
However, it is recommended to run the Witness Node in a third site so that a failover is possible at any time. This can also be outside of the stretched cluster. Only a stable network connection over the HA network with less than 10ms latency is required.
I agree, a third site would be a better solution. But in this case, it isn't an option.
I'll also check it with VMware support.
It would be a nice solution for protecting witness VMs....
So, I debated this with some colleagues.
Conclusion: don't use FT.
Why ?
Well, a split-brain scenario is possible when the two sites are disconnected from each other....
And that is something you don't want.
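A rough sketch of that concern, under a loudly stated assumption: when the inter-site link breaks, the FT primary keeps running in MER 1 and the FT secondary goes live in MER 2, so each site ends up with its own witness copy. Whether FT really behaves like this in a given environment is not confirmed here. The model and all names in it are hypothetical, and it reuses a simple 2-of-3 majority rule.

```python
SITES = ("MER 1", "MER 2")

# Placement from the post: active in MER 1, passive in MER 2.
DATA_NODE_SITE = {"active": "MER 1", "passive": "MER 2"}


def visible_roles(site, witness_copy_sites, link_up):
    """Return the vCenter HA roles that the nodes in `site` can reach.

    A role is reachable if it runs in the same site, or if the
    inter-site link is up.
    """
    roles = {role for role, s in DATA_NODE_SITE.items() if link_up or s == site}
    # Assumption: after a partition, a witness copy may be running in
    # every site listed in witness_copy_sites.
    if any(link_up or s == site for s in witness_copy_sites):
        roles.add("witness")
    return roles


def sides_with_quorum(witness_copy_sites, link_up):
    """List the sites that see at least two of the three roles."""
    return [
        site
        for site in SITES
        if len(visible_roles(site, witness_copy_sites, link_up)) >= 2
    ]


def is_split_brain(witness_copy_sites, link_up):
    """Split-brain: the sites are partitioned and both believe they have quorum."""
    return not link_up and len(sides_with_quorum(witness_copy_sites, link_up)) == 2


if __name__ == "__main__":
    # Plain witness in MER 1, link down: only MER 1 keeps quorum.
    print(sides_with_quorum({"MER 1"}, link_up=False))
    # FT-protected witness with a live copy on each side, link down.
    print(is_split_brain({"MER 1", "MER 2"}, link_up=False))
```

In this model a single witness can only ever side with one site, while two live witness copies let both the active and the passive node believe they hold the majority.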
So what are the alternatives ?
1. As mentioned, a third site for placement of witness VMs (for vSAN, vCenter HA, SQL etc...)
2. Keep all vCenter nodes on one site. On a site failure the VMs will be restarted by HA on the other site.
   This would mean an interruption of the vCenter service of at most 15 min.
Scenario 2 is the one we chose.... with DRS rules keeping the VMs on one site.