VMware Networking Community
chadc1979
Enthusiast

Poor performance with Overlay on NSX-T 2.5.1 with vSphere 6.7U3

Hi,

I'm starting to test Geneve-backed segments and have noticed I can only get about 2 VMs on a segment before I start having issues: gpupdate doesn't complete, SMB shares aren't accessible, domain authentication fails, and so on.

I can ping the VMs the entire time all of that is happening. If I turn everything off except one VM, everything works fine on that VM, but as soon as I have 2 or 3 VMs running they start having the above issues.

I thought it might be a network driver issue and updated to the latest available qfle3 driver version, with no luck. I imagine this is an issue with a segment profile, but I'm hoping someone else has already figured out which one.

I'm using QLogic BCM57810 NDC cards in Dell blade servers. ESXi and vCenter are on the latest version of 6.7U3.

I don't know if the same issues are happening on VLAN-backed segments, as my management and edge clusters aren't part of the N-VDS switch.

I don't have ENS enabled on the overlay transport zone either (I saw that it requires specific NICs and a driver, and it's aimed more at telco workloads).

Thanks for any help or pointers anyone can offer.

4 Replies
Sreec
VMware Employee

You should do a proper health check on this cluster. As per your description, you start facing performance issues when the load increases. My feedback is below:

1. Ensure you are using custom images on the ESXi servers.

2. Check the driver and firmware versions and ensure they are supported per the HCL.

3. When you experience the performance hit, perform an ICMP test and confirm whether there are any drops at the guest level or the host level. Run the test with the VMs on the same host, on different hosts in the same chassis, and on hosts in different chassis, considering the blade architecture. I want the traffic to hit the TOR switch and come back. Also limit the test to L2 traffic for now.

4. Run a bandwidth test on all NICs for VLAN and TEP traffic: https://www.virtuallyghetto.com/2016/03/quick-tip-iperf-now-available-on-esxi.html

5. Check the TOR switches to confirm there are no frame drops or CRC errors.

6. Do a packet capture only if you are actually seeing drops. It should be done at the host level and at the switch level.

Cheers,
Sree | VCIX-5X| VCAP-5X| VExpert 7x|Cisco Certified Specialist
chadc1979
Enthusiast

I've done 1 and 2; I'm on the right firmware for the updated driver I installed.

I'm going to try building out a test environment on a DVS backed switch on a different cluster where I don't have NSX-T VIBs installed just to make sure I don't have an issue with my sysprep images.

If all goes well there, I'll migrate the test workload to hosts with the NSX-T VIBs and move the VMs from VLAN-backed to GENEVE-backed segments. That pretty much tests everything, from DVS to N-VDS and from VLAN-backed segments to GENEVE.

mauricioamorim
VMware Employee

This does not seem like a performance issue. 2 VMs should work and there seems to be some underlying issue here.

Can you tell us some more about your test? How have you configured the overlay segments? Where are they attached? Are the source and destination in the test on the same overlay segment, or does the traffic involve north/south routing through the Tier-0 Gateway? Are there multiple ESXi nodes? Please give us as much detail as possible.

theravelund
Contributor

This may be a stupid question, but do you have jumbo frames (at least 1700 MTU) configured on your switch/router and also on the VDS/vSwitch connecting the overlay to the Edge?
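To show why the MTU matters here, the following is a rough sketch of the Geneve encapsulation arithmetic, assuming an IPv4 underlay and the standard header sizes (14-byte inner Ethernet, 8-byte Geneve base header plus variable options, 8-byte UDP, 20-byte outer IPv4). The function names are made up for illustration. Small pings that keep working while larger transfers such as SMB stall would be consistent with an undersized underlay MTU, although nothing in this thread confirms that this is the cause.

```python
# Header sizes in bytes (IPv4 underlay assumed).
INNER_ETHERNET = 14   # encapsulated L2 header of the guest frame
GENEVE_BASE = 8       # fixed Geneve header; options are added on top
UDP_HEADER = 8
OUTER_IPV4 = 20
ICMP_HEADER = 8


def required_underlay_mtu(inner_mtu: int = 1500, geneve_options: int = 0) -> int:
    """IP MTU the underlay needs to carry a full-size guest packet unfragmented."""
    return (inner_mtu + INNER_ETHERNET + GENEVE_BASE
            + geneve_options + UDP_HEADER + OUTER_IPV4)


def fits(inner_mtu: int, underlay_mtu: int, geneve_options: int = 0) -> bool:
    """True if an encapsulated full-size guest packet fits in the underlay MTU."""
    return required_underlay_mtu(inner_mtu, geneve_options) <= underlay_mtu


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return mtu - OUTER_IPV4 - ICMP_HEADER


if __name__ == "__main__":
    print(required_underlay_mtu(1500))   # 1550 with no Geneve options
    print(fits(1500, 1500))              # False: a default 1500 underlay is too small
    print(fits(1500, 1700))              # True
    print(max_ping_payload(1700))        # 1672
```

A don't-fragment ping between TEP interfaces with the payload size from `max_ping_payload` is one way to check that the configured MTU is actually honored end to end.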
