VMware Cloud Community
Mike_Gray
Enthusiast

vSAN help

Hello team,

I am seeing intermittent vSAN health check failures on ESXi 6.5. We have a 3-node cluster with L2 connectivity between the nodes over a LAG. The vSAN vmkernel ports intermittently cannot reach each other, although we can vmkping between them. Can someone help me with this?

TheBobkin
Champion

Hello Mike,

Specifically which Health alert is being triggered? (This one perhaps? kb.vmware.com/kb/2108011)

Are specific pairs of nodes triggering the alarm consistently, or is it all nodes? (The drill-down of the alert should show this.)

Is this a stretched-cluster?

Is this intermittent, and is it causing a proper cluster partition (VMs become inaccessible/go down)?

If it is the alert in the KB article above, then follow the steps there to do the recommended checks.

Bob

Mike_Gray
Enthusiast

I tested host reachability from all the other hosts and could see that, while the issue is occurring, one host is not able to get the ARP entry for another host's vSAN vmk. What would be the reason?

TheBobkin
Champion

Hello Mike,

Okay, just to clarify - Is one host not able to reach just one other host when this occurs or is one host not able to communicate with any other hosts?

If just one host to another single host, is it always the same host-host connection?

If it is just one host that cannot communicate with all the others, then check the NIC stats on this host using nicinfo.sh (/usr/lib/vmware/vmware-support/bin/nicinfo.sh) and/or esxcli network nic stats get -n <vmnic name>. Also look on the switch for any errors on the associated port, if you know what you are looking at.
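
As a rough aid for reading those NIC stats, here is a minimal Python sketch that picks out the non-zero error and drop counters from a saved capture. It assumes the stats were saved as plain "Name: value" lines; the SAMPLE text, the vmnic name, and the numbers in it are made up for illustration and are not output from this cluster.

```python
import re

# Illustrative text only: "Name: value" counter lines, not real output.
SAMPLE = """\
NIC statistics for vmnic2
   Packets received: 918273
   Packets sent: 845112
   Receive packets dropped: 0
   Transmit packets dropped: 0
   Total receive errors: 37
   Receive CRC errors: 37
   Total transmit errors: 0
"""


def parse_counters(text):
    """Return {counter name: int value} for every 'Name: <integer>' line."""
    counters = {}
    for line in text.splitlines():
        match = re.match(r"^\s*(.+?):\s*(\d+)\s*$", line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def suspicious(counters, keywords=("error", "dropped")):
    """Keep only non-zero counters whose name mentions an error or a drop."""
    return {
        name: value
        for name, value in counters.items()
        if value > 0 and any(k in name.lower() for k in keywords)
    }


if __name__ == "__main__":
    for name, value in suspicious(parse_counters(SAMPLE)).items():
        print(f"{name}: {value}")
```

These counters are cumulative, so comparing two captures taken before and after an occurrence is usually more telling than a single reading.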

Either way, I would advise taking a closer look at your network configuration for vSAN on the affected host(s) to ensure that best practices have been applied and nothing is misconfigured.

Bob

Mike_Gray
Enthusiast

Bob,

It's not host-specific; the issue occurs randomly with all hosts in the cluster. Please note that there is no issue with management, only with the vSAN/vMotion vmkernel ports.

3 nodes with 10G.

LAG with load balancing on src/dst IP and TCP/UDP ports.

LAG as primary uplink for the dedicated port group.

Single VMK private IP on each host.

This is our setup.
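
To illustrate what the load-balancing mode listed above implies, here is a minimal Python sketch of hashing a flow's source/destination IP and ports onto a LAG member. The hash is a stand-in chosen for the example (the real one is implemented by the distributed switch and the physical switch), and the IP addresses, port numbers, and the two-member LAG are all made-up assumptions.

```python
import hashlib
import ipaddress
from collections import Counter


def pick_uplink(src_ip, dst_ip, src_port, dst_port, n_uplinks):
    """Toy hash: map one flow (IPs + ports) onto one of n_uplinks LAG members.

    Stand-in only; it is not the algorithm used by any real switch.
    """
    key = (
        ipaddress.ip_address(src_ip).packed
        + ipaddress.ip_address(dst_ip).packed
        + src_port.to_bytes(2, "big")
        + dst_port.to_bytes(2, "big")
    )
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_uplinks


def spread(src_ip, dst_ip, dst_port, src_ports, n_uplinks):
    """Count how many flows between one host pair land on each LAG member."""
    return Counter(
        pick_uplink(src_ip, dst_ip, port, dst_port, n_uplinks)
        for port in src_ports
    )


if __name__ == "__main__":
    # Hypothetical vmk addresses and ports, two-member LAG.
    usage = spread("192.168.50.11", "192.168.50.12", 2233, range(50000, 50200), 2)
    for uplink, flows in sorted(usage.items()):
        print(f"uplink {uplink}: {flows} flows")
```

The point of the sketch is that flows between the same two vmk addresses can be spread across different LAG members, so a problem on a single member or switch port could show up as seemingly random failures between hosts. ICMP has no TCP/UDP ports, so a vmkping may not take the same member as the vSAN flows.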
