tiptopf
Contributor

ESXi(6.5): unidirectional traffic via VST - won't bridge tagged traffic from the uplink

Hi all,

I would appreciate any input on the following:

ESXi running 6.5.0 Update 2 (Build 8294253), CIMC 3.1

Topology

VM----portgroup(vlan ID: 100)-----Standard vSwitch-----vmnic(10G)-----trunk10G(vlan 100 allowed)----catalyst----SVI100

The problem:

An ARP request initiated by SVI100 is seen on the VM, and the Catalyst learns the VM's MAC address; however, the VM does not learn SVI100's MAC.

pktcap-uw --switchport on vmnic(10G) shows the ARP responses being received from the Catalyst; however, they are not being bridged down to the VM's portgroup, as they do not appear in the capture when running pktcap-uw --switchport on the portgroup.

If I add static ARP entry on the VM, it still won't reach the SVI100.

If I change the VLAN ID to 4095, i.e. trunk mode on the portgroup, then I do see the tagged ARP responses, but VGT is not the objective. If I change it to 0 and convert the trunk on the Catalyst to an access port, everything works.
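For readers less familiar with the three cases described above, here is a minimal Python sketch of how a standard vSwitch portgroup VLAN ID maps to a tagging mode. The function name `tagging_mode` is hypothetical and only illustrates the ranges; it is not part of any VMware tooling.

```python
def tagging_mode(vlan_id: int) -> str:
    """Map a standard vSwitch portgroup VLAN ID to its tagging mode.

    Hypothetical helper for illustration only.
    """
    if vlan_id == 0:
        # No tagging on the vSwitch; the physical switch port handles it (access port).
        return "EST"
    if 1 <= vlan_id <= 4094:
        # The vSwitch adds/strips the 802.1Q tag; the VM sees untagged frames.
        return "VST"
    if vlan_id == 4095:
        # Tags are passed through to the guest, which must handle them itself.
        return "VGT"
    raise ValueError(f"invalid VLAN ID: {vlan_id}")


for vid in (0, 100, 4095):
    print(vid, tagging_mode(vid))
```

In the setup described here, VLAN ID 100 corresponds to VST, which is the mode that was not passing the replies.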

In here they mention:

Enable VLAN tagging specifying 4095 as value:

Now your ESX(i) is VLAN architecture aware

but nowhere in the KB is it stated that this is required. My current suspect is the base license, but it seems dubious that such a basic feature would be policed.

Thanks in advance for any inputs.

Accepted Solutions
tiptopf
Contributor

that was it:

I noticed that ARP requests from the VM come with COS=0, whereas the catalyst responds with COS=5, but this is not the problem as defining a static ARP entry on the VM won't help.

A closer look revealed that the ICMP replies were tagged with COS=5 as well, which is why the static ARP entry on the VM did not help.

Somehow I missed that "Drop" is a separate argument; I had assumed that dropped packets simply would not show up in a capture.
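The COS values mentioned above are carried in the 3-bit priority (PCP) field of the 802.1Q tag, next to the 12-bit VLAN ID. As a rough sketch for checking this in raw captured frames, the following Python snippet decodes the tag. The function name `parse_dot1q` and the sample frame are made up for illustration; the frame is not taken from the actual capture.

```python
import struct


def parse_dot1q(frame: bytes):
    """Return (pcp, dei, vid) for an 802.1Q-tagged Ethernet frame, or None if untagged."""
    if len(frame) < 16:
        return None
    # Bytes 12-13 hold the TPID (0x8100 for 802.1Q), bytes 14-15 the TCI.
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != 0x8100:
        return None
    pcp = tci >> 13          # priority code point, i.e. the COS value
    dei = (tci >> 12) & 0x1  # drop eligible indicator
    vid = tci & 0x0FFF       # VLAN ID
    return pcp, dei, vid


# Hypothetical frame: zeroed MAC addresses, VLAN 100 with COS 5, EtherType ARP.
sample = bytes(12) + struct.pack("!HHH", 0x8100, (5 << 13) | 100, 0x0806)
print(parse_dot1q(sample))
```

Comparing the decoded priority of frames sent by the VM with that of frames received from the Catalyst makes the COS=0 versus COS=5 difference visible at a glance.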

Resolution.

3 Replies
a_p_
Leadership

Welcome to the Community,

From a first look at your configuration and what you describe, I cannot see an obvious configuration issue.

The only thing I can think of is that the physical switch port's default VLAN is also 100, in which case the VLAN ID on the virtual port group has to be set to 0.

André

tiptopf
Contributor

Hi André

thanks for chiming in, the evidence I collected does not corroborate that vlan 100 is the native i.e. untagged vlan:

SPAN off of the catalyst's trunk shows the correct dot1q tag being sent and received

the catalyst1 learns VM's MAC address

tried with a different vlan to see if there is a programming error and vlan 100 is the native vlan

besides, catalyst port trunk config does not have it defined that way and as such vlan 1 is the native vlan by default in this case.
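The native VLAN reasoning above can be sketched as a small Python model of how a trunk port treats an outgoing frame. This is a simplified illustration with a hypothetical function name, `trunk_egress`, and it assumes default behaviour (no option forcing the native VLAN to be tagged).

```python
def trunk_egress(vlan: int, native_vlan: int = 1, allowed=frozenset(range(1, 4095))) -> str:
    """Simplified model of how a dot1q trunk port sends a frame for a given VLAN.

    Hypothetical helper; assumes the native VLAN is sent untagged (the default).
    """
    if vlan not in allowed:
        return "dropped"   # VLAN not allowed on the trunk
    if vlan == native_vlan:
        return "untagged"  # native VLAN frames carry no 802.1Q tag
    return "tagged"        # every other allowed VLAN carries a tag


# With the default native VLAN 1, VLAN 100 leaves the trunk tagged,
# which matches a VST portgroup set to VLAN ID 100.
print(trunk_egress(100, native_vlan=1, allowed={100}))

# If VLAN 100 were the native VLAN, frames would arrive untagged
# and the portgroup would need VLAN ID 0 instead.
print(trunk_egress(100, native_vlan=100, allowed={100}))
```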

I noticed that ARP requests from the VM come with COS=0, whereas the catalyst responds with COS=5, but this is not the problem as defining a static ARP entry on the VM won't help.

The server has its mgmt NIC connected to another switch, catalyst2, untagged in the same VLAN 100, which in turn has a trunk towards the initial catalyst1. There are multiple other VMs (not the problematic one) connected to that mgmt vmnic by way of an untagged portgroup on the default vSwitch. Now, to my amusement, I created an SVI100.2 on catalyst2 and the problematic VM was able to ping it, but not the other way around. This eliminates the theory of unlicensed tagging functionality and the need to have 4095 under CIMC for management.

The problematic VM is connected to a non default vSwitch. Do all vSwitches share the same MAC address table?

I checked the physical path and it is correct: the problematic VM goes out via the 10G link to cat1, and cat2 learns the VM's MAC via the trunk to cat1. It is not as if the VM takes a backdoor via the mgmt vmnic. I also shut down the mgmt vmnic on cat2 and left only the 10G link from cat1; it did not help. But then I do not know how the vSwitch behaves: will it flush all the MACs it has learned, or is there a timeout? Is there any actual learning happening at all?

Also, what perplexes me is why connectivity only works one way: cat1 learns the VM's MAC, yet it cannot ping the VM, nor can the VM ping its SVI100.1.

And why can the VM ping cat2's SVI100.2 while cat2 cannot ping the VM?

The lack of visibility into the mac address-table on the vSwitch makes it difficult to troubleshoot.
