HybridNetArchit
Enthusiast

NSX Transport - MTU Issue, but not in Underlay!

Hi, got an odd one that hopefully has an easy answer.

Scenario: dual-site (Cross-vCenter NSX) deployment with an active-active topology (both sites have local VMs and both use localised ingress/egress). There are two Universal Logical Switches (ULS), each used by default in a particular site.

Applications sitting on these VMs are not working as expected when flows have to travel inter-site (e.g. an app in Site A trying to consume a shared DB in Site B). Investigation using "ping -l <size> -f" from the VMs showed that any inter-DC communication failed if the payload size was 1423 bytes or above; at 1422 and below we have no problems and a nice stable network. The physical network is set to use jumbo frames end to end (so plenty of headroom for the VxLAN overhead).
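Those numbers line up exactly with a 1500-byte IP MTU somewhere in the inter-site path. As a worked calculation (assuming the standard 50-byte VxLAN encapsulation overhead):

1422 (ICMP payload) + 8 (ICMP header) + 20 (IP header) = 1450-byte inner IP packet
1450 + 14 (inner Ethernet) + 8 (VXLAN) + 8 (UDP) + 20 (outer IP) = 1500-byte outer IP packet

So a 1422-byte ping just fits through a 1500 IP MTU once encapsulated, and 1423 tips it over.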

VMs in the same site but on different ESXi hosts have no issues with large MTU payloads (tested above 1600 bytes), so VxLAN is working locally.
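For reference, the host-level equivalent of this test can be run from the ESXi shell using the VXLAN netstack (a sketch - the destination VTEP IP is a placeholder, and -s 1572 corresponds to proving a 1600-byte transport MTU, i.e. 1600 minus 28 bytes of ICMP/IP headers; size it to whatever MTU you want to prove):

vmkping ++netstack=vxlan -d -s 1572 <destination-VTEP-IP>

Running this from a Site A host towards a Site B host VTEP tests the VTEP-to-VTEP path directly, independent of any guest traffic.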

VMs sending large MTU payloads northbound out of NSX via the ESGs also have no problem - again tested in excess of 1600 bytes. So again VxLAN is working locally (payload ULS via UDLR via transit ULS to the ESGs, and out into the physical network via VLANs to the physical upstream router).

The diagram below shows the setup at a high level.

Troubleshooting-NSX.jpg

So in the above diagram:

VM1 to VM2 - No problem

VM3 to VM4 - No problem

VM1 and VM2 to Site A Physical Router - No Problem

VM3 and VM4 to Site B Physical Router - No Problem

VM1 to VM3 or VM4 - MTU Issue

VM2 to VM3 or VM4 - MTU Issue

VM3 to VM1 or VM2 - MTU Issue

VM4 to VM1 or VM2 - MTU Issue

So the above points to an issue in the physical underlay network between sites - however, we see no issue based on the following:

From the physical switches in Site A, we can source pings with data payloads approaching 9000 bytes to the physical switches in Site B. Specifically, this happens within the routed network that VxLAN utilises (the NSX transport network), so we know that anything sourced from an ESXi host VTEP in Site A will use this same logical path when heading towards its destination host VTEP in Site B.

Diagram below to assist in describing this:

Troubleshooting-Phys.jpg

So, for example, I can source a ping with a payload of 8900 bytes from interface A2, sitting in the source NSX transport network of Site A, to interface B2, sitting in the destination NSX transport network of Site B - this works fine. For completeness, on the second path I can likewise source a ping with a payload of 8900 bytes from interface A3 in Site A to interface B3 in Site B. So in this test we are able to send large MTU payloads inter-DC, and over the L3 interfaces involved as part of the end-to-end NSX transport underlay.
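Exact syntax varies by switch vendor; on Cisco IOS, for example, the test above (with placeholder addresses) would look something like:

ping 192.0.2.66 source 192.0.2.34 size 8900 df-bit

One thing worth double-checking: without the DF bit set, a large ping can succeed by being fragmented in transit, which would mask exactly this kind of MTU mismatch.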

So, the question is: what is going on? Locally within each site we see no issues, even between hosts; the issue only arises when NSX transport (VxLAN) runs between sites - yet over an underlay that we can prove supports more than enough MTU.

Suggestions welcome

NSX Version is 6.4.1

2 Replies
Sreec
VMware Employee

From the physical switches in Site A, we can source pings with data payloads approaching 9000 bytes to the physical switches in Site B. Specifically, this happens within the routed network that VxLAN utilises (the NSX transport network), so we know that anything sourced from an ESXi host VTEP in Site A will use this same logical path when heading towards its destination host VTEP in Site B.

If feasible, I would test with a laptop hooked to Switch A (assign an IP from the VXLAN transport VLAN subnet) and run the MTU check all the way to Site B. If this test is successful, the network is ruled out as per my understanding, and in that case I would like to know the configs below:
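From a Windows laptop that check would be something like the following (placeholder address; -f sets the don't-fragment bit, and 8972 = 9000 minus 28 bytes of ICMP/IP headers):

ping -f -l 8972 <Site-B-transport-interface-IP>

The largest -l value that succeeds, plus 28, gives the effective end-to-end IP MTU of the transport path as a host, rather than a switch, sees it.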

1. VTEP configs - single/multi-VTEP and the load-balancing policies for both sites?

2. Server type - blade/rack?

Cheers,
Sree | VCIX-5X| VCAP-5X| VExpert 7x|Cisco Certified Specialist
Please KUDO helpful posts and mark the thread as solved if answered
HybridNetArchit
Enthusiast

In the end it was a bug in the physical switch: it reported the IP MTU as being configured, but on closer inspection the setting was missing or had been lost.

Re-applied the setting and it is now working again. Odd - will keep an eye on it, as it seems buggy; other identical switches built with the same script didn't have the issue.
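For anyone hitting something similar: on Cisco-style switches (a sketch - the interface name and MTU value here are placeholders, and commands vary by vendor/platform) the discrepancy can be spotted by comparing the operational state against the running config, then re-applying the setting:

show ip interface Vlan100 | include MTU
show running-config interface Vlan100

interface Vlan100
 ip mtu 9000

An IP MTU that has silently fallen back to 1500 would match the 1422-byte ping boundary seen at the start of the thread, even while the L2/interface MTU still shows jumbo.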
