VMware Cloud Community
burvil
Enthusiast

Misconfiguration detected error - multicast problem?

I was seeing a ‘Misconfiguration Detected’ error after logging into the Web Client and selecting a cluster, then the Manage tab, Settings subtab, Virtual SAN->General.  This went away after enabling VSAN traffic on the management vmk0 interface.  Per the vSphere 6.0 Documentation Center topic "Network Misconfiguration Status in a Virtual SAN Cluster", this was likely due to a multicast misconfiguration.

I talked to my network admin, who made sure IGMP Snooping was disabled. I then turned VSAN off and on again, and the error disappeared temporarily.  However, when choosing a specific host within a cluster, I still see ‘Host cannot communicate with all other nodes in the Virtual SAN enabled cluster’, and the "Misconfiguration detected" error has come back.

1. Does a multicast group need to be configured for this to work?  I'm referring to vSphere 6.0 Documentation Center, "Multicast Filtering Modes", where it says:

In basic multicast filtering mode, a vSphere Standard Switch or vSphere Distributed Switch forwards multicast traffic for virtual machines according to the destination MAC address of the multicast group. When joining a multicast group, the guest operating system pushes the multicast MAC address of the group down to the network through the switch. The switch saves the mapping between the port and the destination multicast MAC address in a local forwarding table.


    I'm assuming this doesn't apply to the VLAN that I have dedicated for VSAN multicast traffic, where IGMP Snooping should be disabled?
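
For reference, the "multicast MAC address of the group" in the quoted passage is derived from the group's IP address. Here is a minimal Python sketch of that standard IPv4-to-Ethernet mapping (RFC 1112). The two groups in the loop are an assumption on my part: they are the values commonly cited as Virtual SAN defaults, and the real ones on a host should be checked with `esxcli vsan network list`.

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group to its Ethernet multicast MAC (RFC 1112)."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    # Only the low 23 bits of the group address are copied into the MAC;
    # the prefix 01:00:5e is fixed for IPv4 multicast.
    low23 = int(addr) & 0x7FFFFF
    octets = [0x01, 0x00, 0x5E,
              (low23 >> 16) & 0xFF, (low23 >> 8) & 0xFF, low23 & 0xFF]
    return ":".join(f"{o:02x}" for o in octets)


# Assumed example groups, not taken from this cluster's configuration.
for group in ("224.1.2.3", "224.2.3.4"):
    print(group, "->", multicast_mac(group))
```

Because only 23 bits are carried over, several groups can share one MAC address, which is one reason switch-side filtering is usually handled per IGMP group rather than per MAC.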



2.  I can’t configure VLAN ID 4095 to allow all trunked VLANs - is this needed?

When configuring the network on the ESXi server, I had thought that I’d have to specify 4095 for the VLAN ID to allow all trunked 802.1q VLANs through.

As:

- the documentation I found only goes to 4094 and

- in the ESXi console I entered 4095 for the VLAN ID (which failed),

is the VLAN ID of the management network fine as it is?  Or could it be that traffic for this VLAN is not even getting passed through to the server because of this?
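
To make the VLAN ID ranges concrete, here is a small Python sketch of how a VLAN ID on a vSphere standard switch port group is generally interpreted: 0 means no tagging by the virtual switch, 1 to 4094 tags a single VLAN, and 4095 passes all tagged frames through for the guest to handle. The function name and the returned labels are made up for illustration; this is not a VMware API.

```python
def classify_vlan_id(vlan_id: int) -> str:
    """Describe how a standard switch port group treats a given VLAN ID.

    Hypothetical helper for illustration only.
    """
    if not 0 <= vlan_id <= 4095:
        raise ValueError(f"VLAN ID {vlan_id} is outside the 12-bit range 0-4095")
    if vlan_id == 0:
        # The virtual switch does not tag; the physical switch port decides.
        return "untagged"
    if vlan_id == 4095:
        # All tagged frames are passed through; the guest handles the tags.
        return "trunk-all"
    # The virtual switch tags and untags frames for this single VLAN.
    return "single-vlan"


for candidate in (0, 98, 4094, 4095):
    print(candidate, "->", classify_vlan_id(candidate))
```

Under that reading, a VMkernel interface such as the management or VSAN vmk would normally sit on a port group tagged with the one VLAN it needs, so 4095 should not be required just for traffic on a single VLAN to reach the host.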

3. Below is what I understand from VMware about this error:

Cluster will not form – nodes won’t be able to communicate – Misconfiguration detected

Option 1 – Disabling IGMP Snooping => allows all multicast traffic through

Option 2 - Configure an IGMP Snooping Querier => if there is other multicast traffic and you are concerned that multicast traffic might flood the network






I also got this link for what's required from the Cisco side, but it doesn't work (gives 404 error).  Can anyone give instructions that work?

http://www.cisco.com/c/en/us/td/docs/switches/datacenter/sw/nxSos/multicast/configuration/guide/b_mu...






4. Comparing the output from two servers, it looks like they can't ping each other on vmk1 (the 10.27.98.7* addresses).  My network engineer says the switchports are all configured the same.  Any thoughts on why they can't ping?

[root@host05:~] esxcli network ip interface ipv4 get
Name  IPv4 Address  IPv4 Netmask     IPv4 Broadcast  Address Type  DHCP DNS
----  ------------  ---------------  --------------  ------------  --------
vmk0  10.27.98.199  255.255.255.192  10.27.98.255    STATIC           false
vmk1  10.27.98.71   255.255.255.192  10.27.98.127    STATIC           false
vmk2  10.27.98.136  255.255.255.192  10.27.98.191    STATIC           false

[root@host05:~] vmkping -I vmk1 10.27.98.71
PING 10.27.98.71 (10.27.98.71): 56 data bytes
64 bytes from 10.27.98.71: icmp_seq=0 ttl=64 time=0.096 ms
64 bytes from 10.27.98.71: icmp_seq=1 ttl=64 time=0.084 ms
--- 10.27.98.71 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.084/0.090/0.096 ms

[root@host05:~] vmkping -I vmk1 10.27.98.70
PING 10.27.98.70 (10.27.98.70): 56 data bytes
--- 10.27.98.70 ping statistics ---
3 packets transmitted, 0 packets received, 100% packet loss

[root@host05:~] vmkping -I vmk1 10.27.98.71
PING 10.27.98.71 (10.27.98.71): 56 data bytes
64 bytes from 10.27.98.71: icmp_seq=0 ttl=64 time=0.088 ms
64 bytes from 10.27.98.71: icmp_seq=1 ttl=64 time=0.074 ms
64 bytes from 10.27.98.71: icmp_seq=2 ttl=64 time=0.081 ms
--- 10.27.98.71 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.074/0.081/0.088 ms

[root@host04:~] esxcli network ip interface ipv4 get
Name  IPv4 Address  IPv4 Netmask     IPv4 Broadcast  Address Type  DHCP DNS
----  ------------  ---------------  --------------  ------------  --------
vmk0  10.27.98.198  255.255.255.192  10.27.98.255    STATIC           false
vmk1  10.27.98.70   255.255.255.192  10.27.98.127    STATIC           false
vmk2  10.27.98.135  255.255.255.192  10.27.98.191    STATIC           false

--- 10.27.98.70 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.072/0.080/0.089 ms

[root@host04:~] vmkping -I vmk1 10.27.98.71
PING 10.27.98.71 (10.27.98.71): 56 data bytes
64 bytes from 10.27.98.71: icmp_seq=0 ttl=64 time=0.726 ms
64 bytes from 10.27.98.71: icmp_seq=1 ttl=64 time=0.362 ms
64 bytes from 10.27.98.71: icmp_seq=2 ttl=64 time=0.561 ms
--- 10.27.98.71 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.362/0.550/0.726 ms
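
One quick sanity check on the output above is whether the two vmk1 addresses really fall in the same subnet for the netmask shown. A minimal Python sketch using only the standard `ipaddress` module, with the addresses and netmask copied from the `esxcli` output (the dictionary and helper names are just for illustration):

```python
import ipaddress

NETMASK = "255.255.255.192"  # netmask reported for every vmk interface above

# vmk1 addresses as reported by esxcli on each host
vmk1 = {"host05": "10.27.98.71", "host04": "10.27.98.70"}


def interface(ip: str, mask: str = NETMASK) -> ipaddress.IPv4Interface:
    """Build an interface object from an address and a dotted netmask."""
    return ipaddress.IPv4Interface(f"{ip}/{mask}")


for host, ip in vmk1.items():
    net = interface(ip).network
    print(host, ip, "network", net, "broadcast", net.broadcast_address)

same_subnet = interface(vmk1["host05"]).network == interface(vmk1["host04"]).network
print("vmk1 addresses share a subnet:", same_subnet)
```

Both addresses resolve to 10.27.98.64/26 with broadcast 10.27.98.127, which matches the broadcast column reported by `esxcli`. If the addressing is consistent like this, the failed ping from host05 to 10.27.98.70 would more likely come from layer 2 (VLAN tagging, port group, uplink or switchport) than from the IP plan, though that is only an inference from the output shown.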







Accepted Solutions
depping
Leadership

PS: it definitely sounds like a network configuration issue

4 Replies
depping
Leadership

Have you installed the Virtual SAN Health Check Plugin? It will give you more insight into the problem you are experiencing.

depping
Leadership

PS: it definitely sounds like a network configuration issue

burvil
Enthusiast

Yeah, that's my thought, too - that it's some kind of misconfiguration for multicast, or maybe some kind of incompatibility on the switch.  I tried plugging in the IP, netmask and gateway for use as a management address in the server console, and it was fine: it passed all the network tests (pinged the gateway and DNS, and resolved the hostname OK).  I'll need to dig up the exact instructions/configuration needed on the switch for multicast, as I've checked with the network admin a couple of times and keep getting the same response - that it is configured correctly.  But yes, I agree - the systems side looks fine to me.

I'll also look into the VSAN Health Check Plugin.  Looks like it's now integrated with vRealize Operations Manager, which we have as well.

FYI - I googled the URL above that initially gave me a 404 error, and it looks like it has been redirected to Cisco Nexus 7000 Series NX-OS Multicast Routing Configuration Guide - Configuring IGMP Snooping [Cis....  But that is pretty general and not too helpful.

burvil
Enthusiast

I ended up removing and re-adding the hosts, and in the process, I think I found the networking wasn't set up properly.  I think there were some network hiccups that caused the systems to be offline, necessitating the re-add.  It's working fine now.
