VMware Cloud Community
insearchof
Expert

Host is in a VSAN enabled cluster but does not have VSAN service enabled

Hi,

I am new to vSAN.

I added an SSD drive to my ESXi 6.5 host.

I have 5 hosts in my vCenter. All show the same message:

Host cannot communicate with one or more other nodes in the vsan enabled cluster

One host also shows this message:

Host is in a VSAN enabled cluster but does not have VSAN service enabled

I found this article

https://docs.vmware.com/en/VMware-vSphere/6.5/com.vmware.vsphere.virtualsan.doc/GUID-CD12983A-4648-4...

The relevant part is extracted below. I enabled the vSAN service on the VMkernel adapter on each host, including this one.

This host has 32 GB of physical RAM.

Any other ideas?

     

Host is in a VSAN enabled cluster but does not have VSAN service enabled

Verify whether vSAN network is properly configured and enabled on the host. See Configuring vSAN Network.

Add memory to the host. If you are using a nested ESXi VM, shutdown the VM and increase its memory.

Accepted Solutions
TheBobkin
Champion

Hello

You still haven't indicated if you are even using Jumbo frames (9000 MTU) which I did ask twice, so the above isn't actually an indication of whether you have an issue with MTU mismatch.

Check what MTU is actually configured on the problem node(s) with:

# esxcfg-nics -l

# esxcfg-vmknic -l

# esxcfg-vswitch -l

If it is 1500 throughout then you are not using Jumbo frames and should be testing connectivity (e.g. from node TGCSESXI-9) using:

# vmkping -I vmk2 -s 1472 -d 10.2.8.62
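
The 1472 in that command is the 1500-byte MTU minus the 20-byte IPv4 header and the 8-byte ICMP header; the same arithmetic gives 8972 for a 9000 MTU. A minimal sketch of the calculation follows. The function name `icmp_payload_for_mtu` is only illustrative and is not an ESXi tool.

```shell
#!/bin/sh
# Largest ICMP payload that fits in one unfragmented IPv4 packet for a given MTU.
# Assumes a 20-byte IPv4 header (no options) and an 8-byte ICMP header.
icmp_payload_for_mtu() {
    mtu="$1"
    echo $((mtu - 20 - 8))
}

icmp_payload_for_mtu 1500   # standard frames: prints 1472
icmp_payload_for_mtu 9000   # jumbo frames: prints 8972
```

The result is the value to pass to `vmkping -s` together with `-d`.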

Bob

Replies
MJMSRI
Enthusiast

Hi, all hosts need a VMkernel port with the vSAN service enabled. It sounds like you have done this. The next steps are:

  • Set a static IP address and subnet on every host's vSAN VMkernel port, all on the same L2 network.
  • Check the VLAN ID on the virtual switch and ensure it is set the same on every host.
  • Check the physical switches to ensure the connected ports are all enabled (not shut down) and have the same access-port VLAN ID set.
  • If all of the above is in place, open PuTTY to a host and vmkping another host's vSAN VMkernel port to test connectivity: vmkping -I vmk1 10.10.10.10 -s 9000 -c 100 (change vmk1 to the VMkernel port that vSAN is assigned to; vmk0 will be for management. Then specify the destination host's vSAN static IP.)
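
One caveat on a large-packet test like this: without `-d` (don't fragment), an oversized ping can be fragmented and still succeed, so on its own it does not show that jumbo frames work end to end. A rough sketch that prints one don't-fragment vmkping command per peer is below. The helper name `print_vsan_pings` and the addresses are placeholders, and nothing is executed.

```shell
#!/bin/sh
# Print (not run) a don't-fragment vmkping command for each peer address.
# Usage: print_vsan_pings <vmk interface> <payload bytes> <peer ip>...
print_vsan_pings() {
    vmk="$1"
    payload="$2"
    shift 2
    for peer in "$@"; do
        # -I: source VMkernel interface, -s: ICMP payload size,
        # -d: set the don't-fragment bit, -c: number of packets
        echo "vmkping -I $vmk -s $payload -d -c 3 $peer"
    done
}

# Placeholder values; substitute your own vSAN vmk and peer addresses.
print_vsan_pings vmk1 8972 10.10.10.10 10.10.10.11
```

Use a payload of 8972 for a 9000 MTU or 1472 for a 1500 MTU, so that the payload plus 28 bytes of headers matches the MTU exactly.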
insearchof
Expert

I do not have any VLANs set up.

All VMkernel adapters have static IP addresses and all are on the same subnet.

All physical network adapters are active and working.

I ran vmkping to all hosts in the cluster and, as you can see, I get valid responses.

[root@TGCSESXI-4:~] vmkping -I vmk3 10.2.8.84 -s 9000 -c 100
PING 10.2.8.84 (10.2.8.84): 9000 data bytes
9008 bytes from 10.2.8.84: icmp_seq=0 ttl=64 time=0.136 ms
9008 bytes from 10.2.8.84: icmp_seq=1 ttl=64 time=0.093 ms
9008 bytes from 10.2.8.84: icmp_seq=2 ttl=64 time=0.092 ms
9008 bytes from 10.2.8.84: icmp_seq=3 ttl=64 time=0.093 ms
9008 bytes from 10.2.8.84: icmp_seq=4 ttl=64 time=0.151 ms
9008 bytes from 10.2.8.84: icmp_seq=5 ttl=64 time=0.106 ms

--- 10.2.8.84 ping statistics ---
6 packets transmitted, 6 packets received, 0% packet loss
round-trip min/avg/max = 0.092/0.112/0.151 ms
[root@TGCSESXI-4:~] vmkping -I vmk2 10.2.8.38 -s 9000 -c 100
PING 10.2.8.38 (10.2.8.38): 9000 data bytes
9008 bytes from 10.2.8.38: icmp_seq=1 ttl=64 time=0.500 ms
9008 bytes from 10.2.8.38: icmp_seq=2 ttl=64 time=0.777 ms
9008 bytes from 10.2.8.38: icmp_seq=3 ttl=64 time=0.482 ms
9008 bytes from 10.2.8.38: icmp_seq=4 ttl=64 time=0.498 ms
9008 bytes from 10.2.8.38: icmp_seq=5 ttl=64 time=0.499 ms

--- 10.2.8.38 ping statistics ---
6 packets transmitted, 5 packets received, 16% packet loss
round-trip min/avg/max = 0.482/0.551/0.777 ms
[root@TGCSESXI-4:~] vmkping -I vmk2 10.2.8.62 -s 9000 -c 100
PING 10.2.8.62 (10.2.8.62): 9000 data bytes
9008 bytes from 10.2.8.62: icmp_seq=1 ttl=64 time=0.452 ms
9008 bytes from 10.2.8.62: icmp_seq=2 ttl=64 time=0.512 ms
9008 bytes from 10.2.8.62: icmp_seq=3 ttl=64 time=0.578 ms
9008 bytes from 10.2.8.62: icmp_seq=4 ttl=64 time=0.471 ms
9008 bytes from 10.2.8.62: icmp_seq=5 ttl=64 time=0.447 ms
9008 bytes from 10.2.8.62: icmp_seq=6 ttl=64 time=0.515 ms
9008 bytes from 10.2.8.62: icmp_seq=7 ttl=64 time=0.658 ms

--- 10.2.8.62 ping statistics ---
8 packets transmitted, 7 packets received, 12% packet loss
round-trip min/avg/max = 0.447/0.519/0.658 ms
[root@TGCSESXI-4:~] vmkping -I vmk3 10.2.8.67 -s 9000 -c 100
PING 10.2.8.67 (10.2.8.67): 9000 data bytes
9008 bytes from 10.2.8.67: icmp_seq=0 ttl=64 time=0.520 ms
9008 bytes from 10.2.8.67: icmp_seq=1 ttl=64 time=0.541 ms
9008 bytes from 10.2.8.67: icmp_seq=2 ttl=64 time=0.544 ms
9008 bytes from 10.2.8.67: icmp_seq=3 ttl=64 time=0.548 ms
9008 bytes from 10.2.8.67: icmp_seq=4 ttl=64 time=0.508 ms
9008 bytes from 10.2.8.67: icmp_seq=5 ttl=64 time=0.515 ms
9008 bytes from 10.2.8.67: icmp_seq=6 ttl=64 time=0.508 ms

--- 10.2.8.67 ping statistics ---
7 packets transmitted, 7 packets received, 0% packet loss
round-trip min/avg/max = 0.508/0.526/0.548 ms
[root@TGCSESXI-4:~] vmkping -I vmk2 10.2.8.47 -s 9000 -c 100
PING 10.2.8.47 (10.2.8.47): 9000 data bytes
9008 bytes from 10.2.8.47: icmp_seq=1 ttl=64 time=0.524 ms
9008 bytes from 10.2.8.47: icmp_seq=2 ttl=64 time=0.518 ms
9008 bytes from 10.2.8.47: icmp_seq=3 ttl=64 time=0.520 ms
9008 bytes from 10.2.8.47: icmp_seq=4 ttl=64 time=0.517 ms
9008 bytes from 10.2.8.47: icmp_seq=5 ttl=64 time=0.461 ms
9008 bytes from 10.2.8.47: icmp_seq=6 ttl=64 time=0.521 ms

--- 10.2.8.47 ping statistics ---
7 packets transmitted, 6 packets received, 14% packet loss
round-trip min/avg/max = 0.461/0.510/0.524 ms
[root@TGCSESXI-4:~]

Anything else to check?

insearchof
Expert

I got this from another question; I hope it helps us figure this out.

[root@TGCSESXI-4:~] vmkping -I vmk2 -s 8972 -d 10.2.8.62
PING 10.2.8.62 (10.2.8.62): 8972 data bytes
sendto() failed (Message too long)
sendto() failed (Message too long)
sendto() failed (Message too long)

--- 10.2.8.62 ping statistics ---
3 packets transmitted, 0 packets received, 100% packet loss
[root@TGCSESXI-4:~]

???

I tried on another host with the same results:

[root@TGCSESXI-9:~] vmkping -I vmk2 -s 8972 -d 10.2.8.62
PING 10.2.8.62 (10.2.8.62): 8972 data bytes
sendto() failed (Message too long)
sendto() failed (Message too long)
sendto() failed (Message too long)

--- 10.2.8.62 ping statistics ---
3 packets transmitted, 0 packets received, 100% packet loss
[root@TGCSESXI-9:~]
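
For context, "sendto() failed (Message too long)" is reported by the sending host itself: with `-d` set, the packet (payload plus 28 bytes of IPv4 and ICMP headers) is larger than the MTU of the outgoing interface, so nothing is sent. That would be consistent with vmk2 still being at MTU 1500 when the test was run, although the thread does not show this directly. A small sketch of the check, with the hypothetical helper name `df_ping_fits`:

```shell
#!/bin/sh
# Succeed if a don't-fragment ping with this payload fits the interface MTU.
# Assumes 28 bytes of overhead: 20-byte IPv4 header + 8-byte ICMP header.
df_ping_fits() {
    if_mtu="$1"
    payload="$2"
    [ $((payload + 28)) -le "$if_mtu" ]
}

for m in 1500 9000; do
    if df_ping_fits "$m" 8972; then
        echo "MTU $m: 8972-byte payload fits"
    else
        echo "MTU $m: 8972-byte payload is too long"
    fi
done
```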

insearchof
Expert

Still looking for help on this

insearchof
Expert

[root@TGCSESXI-4:~] esxcfg-nics -l
Name    PCI          Driver      Link Speed      Duplex MAC Address       MTU    Description
vmnic0  0000:02:00.0 bnx2        Up   1000Mbps   Full   d4:ae:52:77:6c:b5 9000   QLogic Corporation QLogic NetXtreme II BCM5716 1000Base-T
vmnic1  0000:02:00.1 bnx2        Up   1000Mbps   Full   d4:ae:52:77:6c:b6 9000   QLogic Corporation QLogic NetXtreme II BCM5716 1000Base-T
vmnic2  0000:04:00.0 bnx2        Up   1000Mbps   Full   00:26:55:87:67:dc 9000   QLogic Corporation NC382T PCI Express Dual Port Multifunction Gigabit Server Adapter
vmnic3  0000:04:00.1 bnx2        Up   1000Mbps   Full   00:26:55:87:67:de 9000   QLogic Corporation NC382T PCI Express Dual Port Multifunction Gigabit Server Adapter
[root@TGCSESXI-4:~] esxcfg-vmknic -l
Interface  Port Group/DVPort/Opaque Network        IP Family IP Address                              Netmask         Broadcast       MAC Address       MTU     TSO MSS   Enabled Type                NetStack
vmk0       Management Network                      IPv4      10.2.8.75                               255.255.252.0   10.2.11.255     d4:ae:52:77:6c:b5 9000    65535     true    STATIC              defaultTcpipStack
vmk0       Management Network                      IPv6      fe80::d6ae:52ff:fe77:6cb5               64                              d4:ae:52:77:6c:b5 9000    65535     true    STATIC, PREFERRED   defaultTcpipStack
vmk1       VMkernel                                IPv4      10.2.8.73                               255.255.252.0   10.2.11.255     00:50:56:6a:33:1a 9000    65535     true    STATIC              defaultTcpipStack
vmk1       VMkernel                                IPv6      fe80::250:56ff:fe6a:331a                64                              00:50:56:6a:33:1a 9000    65535     true    STATIC, PREFERRED   defaultTcpipStack
vmk2       Vmotion                                 IPv4      10.2.8.92                               255.255.252.0   10.2.11.255     00:50:56:63:b4:8d 9000    65535     true    STATIC              defaultTcpipStack
vmk2       Vmotion                                 IPv6      fe80::250:56ff:fe63:b48d                64                              00:50:56:63:b4:8d 9000    65535     true    STATIC, PREFERRED   defaultTcpipStack
vmk3       Black Armor                             IPv4      10.2.8.84                               255.255.252.0   10.2.11.255     00:50:56:66:bb:ab 9000    65535     true    STATIC              defaultTcpipStack
vmk3       Black Armor                             IPv6      fe80::250:56ff:fe66:bbab                64                              00:50:56:66:bb:ab 9000    65535     true    STATIC, PREFERRED   defaultTcpipStack
[root@TGCSESXI-4:~] esxcfg-vswitch -l
Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks
vSwitch0         1792        12          128               9000    vmnic0,vmnic1

  PortGroup Name        VLAN ID  Used Ports  Uplinks
  VM Network            0        3           vmnic0,vmnic1
  Black Armor           0        1           vmnic0,vmnic1
  Vmotion               0        1           vmnic0,vmnic1
  VMkernel              0        1           vmnic0,vmnic1
  Management Network    0        1           vmnic0,vmnic1

Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks
vSwitch1         1792        4           128               9000    vmnic2

  PortGroup Name        VLAN ID  Used Ports  Uplinks
  VMkernal-iSCSI-1      0        1           vmnic2

Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks
vSwitch2         1792        3           128               9000    vmnic3

  PortGroup Name        VLAN ID  Used Ports  Uplinks
  VMkernal-iSCSI-2      0        0           vmnic3

[root@TGCSESXI-4:~]
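
With this many interfaces it is easy to miss one that is still at 1500. The sketch below reduces `esxcfg-vmknic -l` output to interface name and MTU for the IPv4 rows. It assumes the column layout shown above, where the MTU follows the MAC address, and that an `awk` is available wherever it is run; the function name `vmk_mtus` is made up for illustration.

```shell
#!/bin/sh
# Read `esxcfg-vmknic -l` output on stdin; print "<vmk> <MTU>" per IPv4 row.
# Port group names may contain spaces, so locate the MAC address field
# (17 characters, six colon-separated parts) and take the field after it.
vmk_mtus() {
    awk '/IPv4/ {
        for (i = 1; i < NF; i++) {
            if (length($i) == 17 && split($i, parts, ":") == 6) {
                print $1, $(i + 1)
                break
            }
        }
    }'
}

# Intended use on a host (assumption): esxcfg-vmknic -l | vmk_mtus
# Sample row copied from the output above:
echo "vmk2       Vmotion                                 IPv4      10.2.8.92                               255.255.252.0   10.2.11.255     00:50:56:63:b4:8d 9000    65535     true    STATIC              defaultTcpipStack" | vmk_mtus
```

For the sample row this prints `vmk2 9000`.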

I just changed all vSwitches and VMkernel adapters to MTU 9000, as you can see.

The ping:

[root@TGCSESXI-9:~] vmkping -I vmk2 -s 1472 -d 10.2.8.62
PING 10.2.8.62 (10.2.8.62): 1472 data bytes
1480 bytes from 10.2.8.62: icmp_seq=0 ttl=64 time=0.312 ms
1480 bytes from 10.2.8.62: icmp_seq=1 ttl=64 time=0.249 ms
1480 bytes from 10.2.8.62: icmp_seq=2 ttl=64 time=0.344 ms

--- 10.2.8.62 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.249/0.302/0.344 ms
[root@TGCSESXI-9:~]

Host TGCSESXI-4 still shows the error after changing the MTU to 9000. I also changed the MTU to 9000 on all hosts in the cluster.

insearchof
Expert

I updated my Cisco 3750 switch.

I ran these commands on the Cisco:

config

system mtu jumbo 9000

exit

reload

So now the external network device has jumbo frames enabled.

I still get the warning on the host.

insearchof
Expert

I posted that I changed all MTUs to 9000.

Why no answer?

depping
Leadership

"I posted that I changed all MTUs to 9000.

Why no answer?"

Just to remind you, there's no SLA on responses to community forum threads. Most people on here do this to help others, outside of the direct responsibilities of their job. So please be patient.
