VMware Cloud Community
CiaranFoster
Contributor

vSAN network confusion

Hi all.

Am proper frustrated trying to set up a vSAN 2-node cluster.
Googling for ages...watching vids...nothing seems to show me what I need, but the answers must be out there!


So the setup is as follows, and I hope someone can take pity on me and help out!

Three ESXi 7 hosts on the site.

Two designed for the cluster, VM workload and the shared storage, one for hosting the vSAN Witness Appliance.
The two clustered hosts have identical network setups - 8 NICs spread over three vSwitches:
1) MGMT/VMs using first four NICs;
2) vMotion using two NICs;
3) vSAN using two NICs.

The host running the Witness Appliance has 4 NICs and two vSwitches:
1) MGMT/VM network using two NICs;
2) vSAN using two NICs.

So I deployed the Witness VM on the latter host, which is specced for it.
It deploys with one vSwitch, which I assign to the "VM Network" on that host.  I add that to vCenter (not into the cluster) and all seems OK.

I try to configure vSAN and the cluster wizard tells me the Witness has no vSAN traffic on any network, so I manually add a vSwitch using vmnic1 on the Witness, assign it an IP in the vSAN network range and tag it for vSAN...and now the wizard completes.

Here's my first big question - how do the vmnics on the Witness Appliance map to the physical NICs on the host that the Appliance is running on?
This is really catching me out as I can't visualise how the traffic flows. I can't see how the vSwitch I created on the Witness Appliance for vSAN flows through the actual NICs on the correct VLAN on the host.  The Witness Appliance only has two vmnics, whereas the host it resides on has four and they are set up appropriately (I can trace the four cables into switchports configured for either the Management or vSAN VLANs).  I feel this is where I am falling over, as I can't get my head around how these correlate.

I then complete the Configure setup and it seems to go OK (I can build a VM on the vSAN) but I have multiple errors in Skyline Health, including 3 network issues (vSAN cluster partition, vSAN unicast connectivity & vSAN MTU check) and also a data error (vSAN object health), so I'm not filled with confidence in the setup as is!

 

Apologies if I have thrown out a lot of info, or have missed some basics.
I'm getting close to abandoning the whole setup, and it's frustrating as I have read loads of articles and looked at vids online, but nobody seems to cover the basic networking of the host running the Witness and how it relates to the Witness Appliance's own networking.

Cheers for reading this far 🙂

7 Replies
depping
Leadership

It sounds like, as you have indicated, you are missing some of the basics. The fact that you have a network partition listed in the health check indicates that the vSAN network is configured incorrectly. I don't know which docs you followed, but we have some very extensive guides here: https://core.vmware.com/resource/vsan-2-node-cluster-guide

Now, a couple of things:

The host that is hosting the witness, is it a standalone host? If so, then you don't need vSAN enabled on that host at all. The vSAN Witness Appliance will use the "VM Network" portgroup to communicate with the 2-node cluster setup. In order for the Witness to communicate you will need to make sure that a VMkernel interface on it is tagged for vSAN traffic, which it seems you have configured right now.

What is easiest to do next is to look at the 2-node configuration. Verify that each of the two hosts has a VMkernel interface with vSAN enabled on it, an IP assigned, etc.

Then what I typically do is go to the command line on one of the hosts and use "vmkping" to try to ping the other host and the witness. You can use "vmkping -I" to specify which VMkernel interface should be used for testing the network connectivity (more details here: https://kb.vmware.com/s/article/1003728).

It is very likely that at the moment you cannot ping between hosts, so that is the first part you need to resolve. It is probably easiest if you try that first and then report back here.

 

PS: I also moved this topic over to the vSAN forum

TheBobkin
Champion

@CiaranFoster, while you should of course validate that you can ping at all between the vsan-tagged vmks on the data-nodes and the Witness, you should also do this in a way that validates what MTU is being used, what passes without fragmentation, and over which vmk (e.g. ensure you are testing connectivity on the vsan-network, not Management or vMotion).
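
As a rough illustration of the "passes without fragmentation" part, here is a minimal Python sketch that works out the largest ping payload that fits in a given MTU (MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header) and prints the matching vmkping command. The interface name vmk1 and the 192.0.2.12 address are placeholders for illustration only, not values from this thread; -I, -d (don't fragment) and -s (payload size) are the vmkping options being assumed here.

```python
# Sketch only: "vmk1" and "192.0.2.12" are placeholders, not values from this thread.
IP_HEADER = 20    # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def max_ping_payload(mtu):
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IP_HEADER - ICMP_HEADER


def vmkping_command(vmk, target_ip, mtu):
    """Build a vmkping command that tests the full MTU with don't-fragment set."""
    size = max_ping_payload(mtu)
    return f"vmkping -I {vmk} -d -s {size} {target_ip}"


# Print the command for a standard and a jumbo-frame MTU.
for mtu in (1500, 9000):
    print(vmkping_command("vmk1", "192.0.2.12", mtu))
```

If the 1500-MTU command succeeds but the 9000-MTU one fails, that would suggest something in the path (vmk, vSwitch or physical switch port) is not passing jumbo frames.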

vSAN cluster membership negotiation (e.g. join cluster) will fail if the cluster members cannot all use the same MTU (e.g. if you have data-nodes at 9000 MTU and the Witness at 1500 MTU then it will not join). I mention this because you said the 'vSAN MTU check' was triggered.
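
To make the MTU comparison concrete, here is a small Python sketch that takes the MTU you have noted down for the vsan-tagged vmk on each host and reports any host that differs from the majority. The host names and values are hypothetical and simply mirror the 9000/1500 example above; you would fill them in by hand from each host's vmk listing.

```python
from collections import Counter


def odd_ones_out(mtu_by_host):
    """Return the hosts whose MTU differs from the most common MTU."""
    counts = Counter(mtu_by_host.values())
    common_mtu, _ = counts.most_common(1)[0]
    return {host: mtu for host, mtu in mtu_by_host.items() if mtu != common_mtu}


# Hypothetical values, entered by hand from each host.
observed = {"data-node-1": 9000, "data-node-2": 9000, "witness": 1500}
print(odd_ones_out(observed))
```

An empty result means every host reported the same MTU; it says nothing about whether the physical switch ports in between allow that frame size.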

 

If you are not sure how/what you should be checking here, feel free to add the following information from both data-nodes + Witness and I can provide the commands to run:
# esxcli vsan network list
# esxcfg-vmknic -l

CiaranFoster
Contributor

Hi there and thanks for the information you provided.
It was helpful!

So I am now in the position where I'm going back to scratch - I've deleted the cluster and the witness appliance and I am back to an environment where I have three ESXi hosts.  Two have the requisite specs (disk, CPU & RAM) to host the VMs, while the 3rd is a smaller box intended to run just the Witness.
All three have VMware ESXi, 7.0.3, 21313628 installed as of right now.

So first question - should I use that 3rd ESXi host as a witness as is, or install the Witness appliance on it (which is what I was doing and encountering issues and confusion)?  It seems to me that it would be easier to simply use its ESXi install as the witness itself and forget the whole idea of the appliance, but I am not sure if that is feasible.
Maybe I am constrained by my license which is:
"vSphere 7 for Virtual SAN Witness for Embedded OEMs" which has a 2 CPU capacity and then I also have 2 CPU capacity of "vSAN Standard" licenses to use (each to-be clustered host has a single CPU).

I am unable to apply that vSAN witness to the ESXi host as it says:
"This is an embedded OEM license. You cannot assign it to non embedded OEM assets" so I assume I need to setup a witness appliance?


If this can be clarified, then I'll get a new appliance set up, so let's get that out of the way first!

Cheers.

 

depping
Leadership

If you use a physical host as the witness then that host will need to be properly licensed as well, so unfortunately you cannot use that OEM license (it is embedded, for the Appliance). Also, the components in that host would need to be certified for vSAN itself, whereas when you run the appliance the components only need to be certified for vSphere. This is why most people use the appliance.

CiaranFoster
Contributor

Cheers.
The standalone host is indeed licensed with a vSphere 7 Standard license.
Will that allow it to be used as a Witness node?

Separately, I am reading the docs and am confused about "witness" traffic.
The docs say:
"In the illustration below, each vSAN Host's vmk0 VMkernel interface is tagged with both "Management" and "witness" traffic. The vSAN Witness Host has the vmk0 VMkernel interface tagged with both "Management" and "vsan" traffic. This is also a supported configuration."
Now, I can see how to enable the "Management" or "vSAN" services on a VMK, but I have yet to see a "Witness" service, so how can that VMK be tagged for "witness" traffic?

 

TheBobkin
Champion

@CiaranFoster Yes, you can run a Witness VM on ESXi licensed with literally any edition (even ESXi free edition); this has no bearing on the license edition of the vSAN cluster.

 

The Witness host should only be tagged for vsan-traffic. If you want to split out vsan-traffic and witness-traffic on the data-nodes, then the witness-traffic tag is added via CLI, e.g.:

# esxcli vsan network ip add -i vmkx -T witness
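
For anyone planning the tagging before touching the hosts, here is a minimal Python sketch that turns a per-vmk plan into commands of the same form as the one above. The vmk numbers in the plan are hypothetical, and the sketch assumes "vsan" and "witness" are the only traffic types you intend to tag; it only prints the commands and does not run anything on a host.

```python
# Sketch only: the vmk names in the plan are hypothetical examples.
ALLOWED_TRAFFIC_TYPES = {"vsan", "witness"}  # assumption: the only types being tagged


def tag_command(vmk, traffic_type):
    """Build a tagging command in the same form as the CLI example above."""
    if traffic_type not in ALLOWED_TRAFFIC_TYPES:
        raise ValueError(f"unexpected traffic type: {traffic_type}")
    return f"esxcli vsan network ip add -i {vmk} -T {traffic_type}"


# Hypothetical plan for one data-node: witness traffic on vmk0, vsan traffic on vmk1.
data_node_plan = {"vmk0": "witness", "vmk1": "vsan"}
for vmk, traffic_type in data_node_plan.items():
    print(tag_command(vmk, traffic_type))
```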

CiaranFoster
Contributor

OK, good info, cheers.

Apologies if I am off the mark here, but let me try to describe what I need in my own words...
So say I don't want to split out that traffic, am I right in saying the Witness host will just have one VMK, with both MGMT and vSAN traffic on it (no Witness traffic at all on this host)?
So the physical NIC(s) need to be attached to a trunk port on the switch, as there will be two separate VLANs on each NIC attached to that vSwitch?

And then on the data hosts, I can use VMK0 for MGMT and Witness traffic while I use, say, VMK1 for vSAN traffic exclusively (and can put VMs and vMotion on different portgroups)?

These are all in the same location so attached to the same switch-stack, FYI.

 

 
