Re: ESXi 6.5 - vSphere Distributed Switch VLAN Trunked and MTU Supported Status warnings

aj800 · ‎11-29-2018

After downloading, staging and applying patches to the first ESXi host in a vSphere 6.5 host cluster, the host, after exiting Maintenance Mode, showed 2 triggered alarms:
vSphere Distributed Switch VLAN trunked status
vSphere Distributed Switch MTU supported status

Upon further review of the vDS settings (vDS --> Monitor tab --> Health tab), there was a warning of 10 issues (it started at 12, but apparently the 'Out of sync' issue was resolved): 'Unsupported VLAN' and 'MTU Mismatch' for each of the 5 hosts. When I select a host, the panel below shows 4 of its 6 vmnics as "unsupported", but the VMs running on this host do not seem to have any connectivity issues (at the moment). How do I resolve this? We are currently running a vDS at version 5.5 (we have not upgraded it yet, and I didn't notice until after the patching was completed), although the vCenter (VCSA) and ESXi hosts are at 6.5.0. This is a production environment and I need to correct this as soon as possible so that I can continue with the patching to address a critical vulnerability. Any help is appreciated.

sk84 · ‎11-29-2018

What exactly does it show when you select a host from the list? It should normally show you the VLAN ID that is not supported. The error usually occurs if you have configured a dvPortgroup with a VLAN ID that does not exist on the physical switch port of an uplink. This usually also triggers the MTU warning for the same VLAN ID.

--- Regards, Sebastian VCP6.5-DCV // VCP7-CMA // vSAN 2017 Specialist Please mark this answer as 'helpful' or 'correct' if you think your question has been answered correctly.

aj800 · ‎11-29-2018

From the list when you select a host (vDS --> Monitor --> Health --> VLAN Tab), it shows under the "VLAN Trunk" column a value of "0" and under the "VLAN Status" column a value of "Not supported" for 4 of the 6 vmnics listed. The other 2 show "Supported". The vDS port groups are not assigned Vlan IDs/numbers.

For the MTU tab, it shows "Not supported" for the same 4/6 vmnics.

Is this because the vDS is still at version 5.5 in a 6.5.0 environment, and would upgrading it first correct this?

sk84 · ‎11-29-2018

Okay. Normally, instead of 0, this column shows the VLAN ID that is missing on the physical switch port. For example: you have created a port group with VLAN ID 10, but VLAN 10 is not configured on the switchports of some uplinks. In this case, the number 10 would be shown there.

If this is 0, it usually means that you have port groups where the VLAN is set to "none", so the dvSwitch sends the frames from these port groups untagged to the physical switch port. The dvSwitch health check tests this case too: if no native VLAN is configured on a trunk port of the physical switch, these untagged frames are dropped and the health check warns about it.

I therefore suspect that the switchport configuration of some uplinks is different, especially the native VLAN configuration.
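As a rough illustration of the reasoning above, the sketch below models which port group VLAN IDs a single physical switch port would fail to carry. It is a simplified model written for this thread, not VMware's actual health check algorithm; the function name `unsupported_vlans` and its parameters are hypothetical.

```python
from typing import List, Optional, Set


def unsupported_vlans(
    portgroup_vlans: Set[int],
    allowed_tagged: Set[int],
    native_vlan: Optional[int],
) -> List[int]:
    """Return the port group VLAN IDs this physical port would not carry.

    portgroup_vlans: VLAN IDs set on the port groups (0 means "none", untagged).
    allowed_tagged:  VLANs the physical trunk port accepts as tagged frames.
    native_vlan:     native (untagged) VLAN of the port, or None if there is none.
    """
    missing = []
    for vlan in sorted(portgroup_vlans):
        if vlan == 0:
            # Untagged frames are only carried if the port has a native VLAN.
            if native_vlan is None:
                missing.append(0)
        elif vlan not in allowed_tagged:
            # Tagged frames need the VLAN to be allowed on the trunk.
            missing.append(vlan)
    return missing


# Port groups with "none" and VLAN 10; trunk allows 10 and 20 but has no native VLAN.
print(unsupported_vlans({0, 10}, {10, 20}, None))
```

With this model, an uplink whose port lacks a native VLAN reports 0 as the unsupported VLAN, while an uplink whose port has one reports nothing, which would match a situation where only some vmnics are flagged.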

aj800 · ‎11-30-2018

The Native VLAN on the physical switch (HP Flex10) shows as VLAN 1, which I suppose is the default setting. There are 3 Vlans, including VLAN 1 going to this environment. So what is the fix for this if that's the case?

sk84 · ‎11-30-2018

Unfortunately, I'm not familiar with HP switches, especially the Virtual Connect modules. With our Cisco network infrastructure, I simply created a VLAN as a native VLAN on each switchport where an ESXi uplink is connected and the problem was solved. We have no untagged traffic in our infrastructure, so using a "dummy" native vlan was an acceptable workaround.

With Cisco it would be:

switch# conf t
switch(config)# vlan 123
switch(config-vlan)# name VMWARE-NATIVE-DUMMY
switch(config-vlan)# exit
switch(config)# int Ethernet1/35
switch(config-if)# switchport trunk allowed vlan add 123
switch(config-if)# switchport trunk native vlan 123
switch(config-if)# end
switch# wr

(VLAN 123 and Ethernet1/35 are only examples; replace the interface with the switchport your ESXi uplink is connected to.)

aj800 · ‎12-06-2018

So, I haven't gotten to the network settings changes or analysis yet, since it's a blade server and I'll have to investigate how it was and should be set up vs. how it currently is. I also wanted to upgrade the vDS to see if that made any difference...

So I upgraded the vDS from version 5.5.0 to 6.5.0, hoping that might clear something up compatibility-wise. It did not. Now, of the 5-host cluster, 3 of the other hosts now show the same Critical Alerts as the original one I had the issue with, but oddly, one of them does not. That host is running the same OS as the others (minus the one host I had patched) and appears to be configured the same way also. So now 4 of the 5 hosts show the Alert. I'll have to review the network settings on both sides but if you or anyone else has any input or recommendations beyond what's already been recommended here, I'm all ears. Thanks.

aj800 · ‎05-17-2019

Hi. I'm picking this back up after not touching it for a while. I still have the critical alerts, since I wanted to get to the bottom of this before acknowledging them. Traffic seems to be working fine despite the alerts persisting.

There are 2 physical switches going to the vDS: HP ProCurve --> Flex-10 pair switch --> vDS

The ProCurve pair is trunking 3 Vlans to the Flex-10s:

2 ports in a trunk (x2, 4 total, 2 per switch)

Vlan 100 Untagged

Vlan 200 Tagged

Vlan 300 Tagged

The Flex-10s configuration shows the same:

6 nics per host x 5 hosts (30 uplinks to vDS)

Vlan 100 (Native)

Vlan 200

Vlan 300

vDS:

dvUplink Group 1 (6 links x 5 hosts = 30 total)

Port Group A (Vlan ID = 0)

Port Group B (Vlan ID = 0)

Port Group C (Vlan ID = 0)

These links from the Flex-10s are all trunked to a single dvUplink group on the vDS, and then there are a few distributed port groups, each of which has no Vlan ID assigned, as mentioned (so, Vlan ID = 0).
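To make the effect of this layout concrete, here is a minimal sketch of where traffic from each port group would land upstream. It assumes the Flex-10 uplinks behave like ordinary 802.1Q trunks with a native VLAN, which is an assumption: Virtual Connect may map networks differently, and the check of the trunk's allowed VLAN list is deliberately left out. The helper `effective_vlan` is hypothetical.

```python
from typing import Optional


def effective_vlan(portgroup_vlan: int, native_vlan: Optional[int]) -> Optional[int]:
    """VLAN a frame from this port group ends up in on the upstream trunk.

    Returns None when the frame is untagged and the trunk has no native VLAN,
    i.e. the frame would be dropped.
    """
    if portgroup_vlan == 0:
        # Untagged frames are placed in the trunk's native VLAN, if any.
        return native_vlan
    # Tagged frames keep the VLAN ID set on the port group.
    return portgroup_vlan


# Values taken from the layout described above.
NATIVE_VLAN = 100
port_groups = {"Port Group A": 0, "Port Group B": 0, "Port Group C": 0}

landing = {name: effective_vlan(vlan, NATIVE_VLAN) for name, vlan in port_groups.items()}
print(landing)
```

Under these assumptions all three port groups would share VLAN 100, and nothing from the vDS would ever reach VLAN 200 or 300 unless a port group is given that VLAN ID.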

For some reason, all 5 of the hosts appear to be configured the same but only one of them now shows no critical alerts. I don't recall acknowledging the alerts.

I'm thinking of testing out just assigning the matching Vlan IDs to the Port Groups as recommended, but I'd like more info before I break something.

I read at the link below that if tagging is done on the physical switch, the Port Groups' Vlan ID on the Virtual Switch should be zero, but I'm not sure if this applies here or if they mean in a situation where the vDS is connecting to an ACCESS port in a Vlan on the physical switch, or something else. Any clarification or additional help would be great, based on the detail I've added. Thanks.

VMware Knowledge Base
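For reference, the VLAN ID conventions usually described for vSphere port groups can be sketched as follows. This is a minimal illustration of the commonly documented EST/VST/VGT modes, not a quote from the linked article; the function name `tagging_mode` is hypothetical, and the value 4095 applies to standard vSwitches, whereas a distributed switch expresses VGT as a VLAN trunking range.

```python
def tagging_mode(vlan_id: int) -> str:
    """Classify a port group VLAN ID into the tagging mode it implies."""
    if vlan_id == 0:
        # External Switch Tagging: the virtual switch sends untagged frames
        # and the physical switch port decides the VLAN.
        return "EST"
    if 1 <= vlan_id <= 4094:
        # Virtual Switch Tagging: the virtual switch adds and removes the tag.
        return "VST"
    if vlan_id == 4095:
        # Virtual Guest Tagging: tags pass through to the guest OS.
        return "VGT"
    raise ValueError(f"invalid VLAN ID: {vlan_id}")


for vlan in (0, 200, 4095):
    print(vlan, tagging_mode(vlan))
```

In these terms, port groups with Vlan ID = 0 rely on the physical port to classify untagged frames, whether that port is an access port or the native VLAN of a trunk.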

SebastianGrugel · ‎10-09-2019

Hi. Did you resolve your issue with the unsupported VLAN 0?

I have the same issue on rack servers that we are using for vSAN.

Strange, because other blade hosts on that same vDS don't have this warning.

Sebastian

vExpert VSAN/NSX/CLOUD | VCAP5-DCA | VCP6-DCV/CMA/NV ==> akademiadatacenter.pl

aj800 · ‎10-18-2019

Hi. I still haven't figured out why only one host did not show any alerts like the others when it appears to be configured exactly the same.

We're now migrating to a set of new hosts & decommissioning the old ones (hardware aging), so what I'm doing now is configuring the new host cluster in the same datacenter the same way as the original cluster, except with new port groups on the vDS that actually have the Vlan IDs assigned (with similar names to the originals).

Though we still have some datastore connectivity to complete before we begin the VM migration to these new hosts, the networking for each new host was set up on the same vDS with these new port groups and I don't get the alerts that the other hosts are still getting. I actually did see the alerts at one point on one or two of them after adding the host & configuring networking, but after entering maintenance mode & rebooting the hosts to ensure everything comes up ok, and to clear up some of the lingering alerts that new installations tend to display, I don't see those alerts anymore, just the HA alerts since we haven't configured the datastores yet.

So, I hope my theory proves true that the Vlan ID = 0 on the original port groups is what the issue was. I'll report back once it's fully functional.

ESXi 6.5 - vSphere Distributed Switch VLAN Trunked and MTU Supported Status warnings