VMware Cloud Community
basteku73
Enthusiast

HA in vSAN - vSphere HA agent on host cannot reach some management network addresses

Hi,

I have a strange problem and I cannot find the reason.

It's a vSAN stretched cluster with 6 nodes. Almost every host reports:

vSphere HA agent on host cannot reach some management network addresses of other hosts: IP_addresses_from_another_DC

So hosts in DC1 report connectivity problems with hosts in DC2. vmkping works fine, meaning the pings come back: I get replies from the addresses in the other DC.

Also I noticed in fdm.log:

verbose fdm[3001144] [Originator@6876 sub=Cluster] ICMP data length 56 smaller than sizeof(ClusterPingData) 64

Maybe that's the problem.
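One way to check whether that log line matters: FDM's ClusterPingData is 64 bytes, while vmkping's default ICMP payload is only 56 bytes, so a plain vmkping can succeed even when HA's larger probe is dropped. A hedged test sketch, reusing the vmk2 interface and x.x.x.1 address from this thread:

```shell
# Send an ICMP ping with a 64-byte payload (matching ClusterPingData)
# instead of the default 56 bytes, out of the vSAN vmkernel interface:
vmkping -I vmk2 -s 64 x.x.x.1
```

If this fails while the default-size vmkping succeeds, something on the path is filtering or mangling larger ICMP packets.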

Regards,

Sebastian

1 Solution

Accepted Solutions
basteku73
Enthusiast

The problem has been solved. I should have mentioned that the nodes of the stretched cluster are in different subnets; configuring the routing was enough to fix it.
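For anyone hitting the same symptom: a static route on each host for the remote site's vmkernel subnet is the usual fix. A minimal sketch with hypothetical subnets (172.16.1.0/24 at site A with gateway 172.16.1.254, 172.16.2.0/24 at site B; substitute your own):

```shell
# On a site-A host, route traffic destined for site B's vmkernel
# subnet via the local site-A gateway:
esxcli network ip route ipv4 add --network 172.16.2.0/24 --gateway 172.16.1.254

# Verify the host routing table:
esxcli network ip route ipv4 list
```

The equivalent route in the opposite direction is needed on the site-B hosts.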

6 Replies
basteku73
Enthusiast

Additional info: in the fdm logs I get:

info fdm[3005210] [Originator@6876 sub=Monitor] No ping reply from x.x.x.1

info fdm[3005210] [Originator@6876 sub=Monitor] No ping reply from x.x.x.2

info fdm[3005210] [Originator@6876 sub=Monitor] No ping reply from x.x.x.3

But when I run vmkping in the ESXi shell:

vmkping -I vmk2 x.x.x.1 (vmk2 is the vSAN network interface), I get:

PING x.x.x.1 (x.x.x.1): 56 data bytes
64 bytes from x.x.x.1: icmp_seq=0 ttl=62 time=0.535 ms
64 bytes from x.x.x.1: icmp_seq=1 ttl=62 time=0.578 ms
64 bytes from x.x.x.1: icmp_seq=2 ttl=62 time=0.610 ms

 

mike-p
Enthusiast

Hi, I would expect HA traffic to be carried via vmk0 on your management network.

basteku73
Enthusiast

Hi, thank you for the response, but when vSAN is enabled, vSphere HA traffic goes over the vSAN network:

basteku73_1-1633938638783.png

Regards, 

Sebastian

depping
Leadership

are you using a different MTU size?

basteku73
Enthusiast

No 🙂

MTU is 9000 everywhere, and a ping with jumbo frames also works. Maybe one more thing: the sites are connected over L3.
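For reference, the jumbo-frame ping mentioned above can be run like this (interface and address taken from earlier in the thread; adjust for your environment):

```shell
# With MTU 9000, the largest ICMP payload that fits in one frame is
# 8972 bytes (9000 - 20-byte IP header - 8-byte ICMP header).
# -d sets the don't-fragment bit so the test fails if any hop's MTU is smaller:
vmkping -I vmk2 -s 8972 -d x.x.x.1
```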

 

Regards, 

Sebastian
