anothervsphereu
Enthusiast

Issue adding additional nodes to vSAN cluster

I recently built a 5-node vSAN cluster using a Dell PowerEdge MX7000 chassis and MX740c sleds with 1 SSD and 2 SAS drives each. The chassis has 1 pair of MX5108n Ethernet switches that uplink to the main core. vSAN traffic does not leave the chassis switches.

The vSAN cluster with the 5 nodes is healthy and all Skyline Health checks pass. I am currently running Horizon View on this cluster.

I am running ESXi 6.7 U3+ on these nodes as well.

I am trying to add 3 more vSAN nodes. They are configured identically to the existing 5 nodes: same model, disk config, network, CPU/memory, etc. I am strictly using a vSphere Distributed Switch with an MTU of 9000 configured for the vSAN vmkernel port group.
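For reference, the host-side MTU settings can be double-checked from the ESXi shell (vmk2 is the vSAN vmkernel port in my setup; interface names may differ):

# esxcli network ip interface list
# esxcli network vswitch dvs vmware list

The first lists each vmk with its MTU, and the second shows the MTU configured on the distributed switch.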

All physical switch ports in the chassis are configured identically: each has an MTU of 9216, flow control is the same, etc.

When I attempted to add one new node, the health check immediately flagged the "vSAN: Basic (unicast) connectivity check" and "vSAN: MTU check (ping with large packet size)" tests on vmk2, which is the vSAN vmkernel port. I verified with ping tests that connectivity between all the nodes is good. I can ping with jumbo frames, although I cannot ping with a packet size of 9000 without fragmentation; then again, I can't do that on the original 5 nodes either, and they still show OK. As soon as I put the host back in maintenance mode, the health status returns to healthy.

Then I tried adding a different node, and over the course of 3 hours or so I retested the health several times; it always showed green. Sometime overnight, something happened with that node that caused all kinds of vSAN issues: Horizon desktops would not power on, and existing sessions were kicked off. The vSAN cluster status showed partitioned. I had to forcefully power off that new node, power it back on, do a full data migration, and remove it from the cluster to get the cluster healthy again. As before, with the 5 original nodes, it shows healthy.

What am I missing here? If the 3 new nodes are on the same Ethernet switches with the same config, etc., why am I having this problem?

I'm lost on this one.

thanks

3 Replies
TheBobkin
Champion

Hello anothervsphereuser,

"I can ping with jumbo frames, although i cannot ping with a packet size of 9000 with no packet fragmentation"

Sorry, but those are 2 conflicting statements: if you can't ping without fragmentation, then the network is either not configured correctly or not capable of passing the intended frame size without fragmenting it.

To be technical about it (assuming IPv4, not IPv6, is in use), you should be able to ping between vmk interfaces with 9000 minus the 28-byte overhead, e.g.:

# vmkping -I vmkX -s 8972 -d <dest IP>

Ensure you are testing with vmkping (not plain ping) and between the actual interfaces related to the issue at hand (e.g. it is not very useful for vSAN if all the hosts can vmkping just fine over the Management vmk).

Something that maybe is not so well publicised regarding vSAN is that we send cluster membership update packets with Don't-Fragment set (i.e. -d), but not necessarily normal data packets. What this means is that one can potentially set up a cluster on one MTU, subsequently lower the MTU, and think everything is fine because the cluster stays clustered (it is not fine - bad idea). Then, if the Master changes or membership needs to be re-negotiated, you may face cluster-partition issues due to the infrastructure not supporting the MTU set on the host side.
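If you want to see the membership state directly from a host while testing, a quick check from the CLI is:

# esxcli vsan cluster get

Compare the Sub-Cluster UUID and Sub-Cluster Member Count across hosts - a partitioned node will show a different member list (often just itself).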

So, please check the maximum packet size that can currently pass with Don't-Fragment (-d) set. If you are not getting 8972 through, then, if possible, raise this accordingly on the switches; if it is not possible to raise it further, then set the MTU end-to-end on the hosts to the largest payload that passes with -d, plus 28 bytes, so that the 'advertised' MTU is actually able to pass.
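One rough way to narrow down the largest passing payload is to step through a few sizes from the ESXi shell, e.g. (vmk2 and the sizes here are just examples):

# for s in 1472 4072 8000 8500 8972; do vmkping -I vmk2 -c 1 -d -s $s <dest IP> && echo "passed at $s"; done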

Note that making changes to MTU on the fly on a production cluster during business hours is a bad idea - if you are planning on or need to make changes to the configured MTU, do this at appropriate hours/days of the week.

If further validation shows the above is irrelevant or the wrong direction, please don't be shy - give us folks at vSAN GSS a call.

Bob

Edit: changed "minus" to "plus" 28 bytes for what needs to be set on the host side.

anothervsphereu
Enthusiast

Hey Bob, thanks for the reply.

I guess I was not clear on the ping stuff, so let me clarify.

On the physical switch ports, I have an MTU of 9216 set. The VLAN I am using for vSAN also has an MTU of 9216. The dvSwitch is configured for an MTU of 9000, so my vSAN vmk port inherits that MTU. I am able to vmkping with a packet size of 8972 with no fragmentation. When I said I could not ping with an MTU of 9000 without fragmentation, I meant something like this: vmkping -I vmk2 -s 9000 -d <ip>. Obviously, since the vmk MTU is 9000, this will fail. I only mentioned it because I saw exactly that command in the VMware docs I was using to troubleshoot.
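Spelling out the arithmetic behind those two results (this is just Bob's 28-byte point applied to my numbers):

vmkping -I vmk2 -s 8972 -d <ip>   <- 8972 + 20 (IPv4 header) + 8 (ICMP header) = 9000, fits the 9000 vmk MTU
vmkping -I vmk2 -s 9000 -d <ip>   <- 9000 + 28 = 9028 > 9000, so with -d it is dropped rather than fragmented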

The only vmk port that is using an MTU of 1500 is vmk0, which is management. All hosts are like that because I migrated vmk0 from the default vSwitch0.

I will check all hosts again with a packet size of 8972 to be sure and post back. I'm not at the point of opening a ticket yet... but very close.

thanks

anothervsphereu
Enthusiast

Using the command below, I can ping from one of my new nodes to all other nodes successfully, with no drops and an RTT between 0.05 and 0.2 ms:

vmkping -I vmk2 -4 -s 8972 -d <ip_address>
