The Stretched Cluster guide is a good start.
The isolation addresses look OK.
I would send the witness through the primary site as that defeats the purpose in case of a network outage. You can use L3 for the witness, as well as WTS (Witness Traffic Separation) to send witness traffic over a separate network. Witness traffic is not very heavy, just metadata, and the requirements are low IMO.
I think you meant you wouldn't send it through the primary site. I agree, but in my case the gateway used to send witness traffic to the witness site is part of an HSRP group, which means that the gateway will (or should) always be available even after the primary site fails.
I have been reading about implementing WTS, since it is available on the version I am running, but I have a couple of things I want to confirm first.
If I understand it correctly, I would add another vmkernel interface to each host, tagged for witness traffic (I could use vmk0 but would rather not).
Site A hosts: vmkWitness 10.10.3.0/24 (VLAN 103), gw 10.10.3.1, located only on Site A
Site B hosts: vmkWitness 10.10.4.0/24 (VLAN 104), gw 10.10.4.1, located only on Site B
and then add static routes to each host as follows:
- Site A hosts:
esxcli network ip route ipv4 add -n 10.10.1.0/24 -g 10.10.3.1
- Site B hosts:
esxcli network ip route ipv4 add -n 10.10.1.0/24 -g 10.10.4.1
- Witness host:
To reach Site A hosts:
esxcli network ip route ipv4 add -n 10.10.3.0/24 -g 10.10.1.1
To reach Site B hosts:
esxcli network ip route ipv4 add -n 10.10.4.0/24 -g 10.10.1.1
10.10.1.1 is the witness host subnet gateway located on the witness site.
Something very similar to this
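For reference, the data-site half of the WTS setup described above can be sketched as a dry run that only prints the commands for review instead of executing them. It assumes the new witness-tagged interface is named vmk1 and already exists on each host (a hypothetical name, not taken from the thread). The subnets and gateways are the ones from the post, and wts_commands is a helper made up for this sketch.

```shell
#!/bin/sh
# Dry run: prints the commands rather than running them, so it works off-host.
WITNESS_NET="10.10.1.0/24"   # witness host subnet, from the post above
WITNESS_VMK="vmk1"           # assumption: name of the new witness-tagged vmkernel

# Print the WTS tagging command and the static route for one data-site host.
# $1 = gateway on that site's witness VLAN
wts_commands() {
    echo "esxcli vsan network ip add -i ${WITNESS_VMK} -T=witness"
    echo "esxcli network ip route ipv4 add -n ${WITNESS_NET} -g $1"
}

echo "# Site A hosts"
wts_commands 10.10.3.1
echo "# Site B hosts"
wts_commands 10.10.4.1
```

After applying the printed commands on a host, esxcli network ip route ipv4 list should show the new route.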
Your example is using the same gateway for the vSAN Witness Host to communicate with either site.
Routing to the vSAN Witness Host is required to be done per site.
It is important that the sites communicate with the vSAN Witness Host independently.
If the HSRP gateway is only available in Site A, but Site A has been isolated, then vSAN won't be able to have more than a single site contributing, resulting in inaccessible data.
- Site A isolated
- Site B & vSAN Witness Host can't communicate because neither can access the gateway, which resides in the isolated site.
- Data inaccessible
An HSRP gateway that resides in a single site isn't going to allow for proper failover.
Alternatively, if Site A and Site B can communicate with the vSAN Witness Host independently:
- Site A isolated
- Site B & vSAN Witness Host can communicate
- Data accessible but out of Storage Policy Compliance.
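The two scenarios above can be walked through with a small sketch, evaluated from Site B's point of view once Site A is isolated. It is a simplified model of the behavior described in this post, not how vSAN itself computes quorum, and data_state is a hypothetical helper.

```shell
#!/bin/sh
# data_state <reaches_other_site> <reaches_witness>
# Each argument is "yes" or "no", as seen from the surviving data site.
data_state() {
    if [ "$1" = "yes" ] && [ "$2" = "yes" ]; then
        # Both data sites and the witness are contributing.
        echo "accessible, compliant"
    elif [ "$1" = "yes" ] || [ "$2" = "yes" ]; then
        # Two of the three (Site A, Site B, witness) are still contributing.
        echo "accessible, out of compliance"
    else
        # Only a single site is contributing.
        echo "inaccessible"
    fi
}

# Gateway lives only in Site A, and Site A is isolated:
data_state no no     # prints "inaccessible"
# Site B has its own path to the witness:
data_state no yes    # prints "accessible, out of compliance"
```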