karel_vins
Contributor

ESXi 6.7 VM packet drops on standard vSwitch with Route based on IP Hash

Hello,

I am running VMware ESXi 6.7.0, build 10764712. I have tried both upgraded and clean installations on different hardware (Cisco UCS C220 M3 and SuperMicro servers with Cisco or Intel 10GbE NICs).

I found that some VMs lose packets (on the order of a few percent) when both uplinks are connected. If either uplink is disconnected, the packet loss disappears.

The test environment has only one guest: CentOS 5 with the latest VMware Tools and a VMXNET3 vNIC (I tried E1000, it was no better). It is a copy of the most affected VM.

This looks like a similar problem: "Route based on IP hash for Windows Server", but that thread has no response.

Or this:

https://www.reddit.com/r/vmware/comments/93fbuw/massive_vsan_latency_increase_on_upgrade_to_67/

There was no problem before the update to 6.7.

Regards,

Karel

6 Replies
Demouge
Contributor

We experienced the same issue.

Hosts on VMware ESXi, 6.7.0, 11675023.

Seems to be a bug with IP HASH.

We disabled EtherChannel on our Cisco 3850 stack and are using Port ID instead.

This resolved the packet loss. We spoke to a network engineer, who confirmed the bug.

They can apply a workaround for you remotely (you can't do this yourself), but we would have to migrate all VMs off one host, call support, have the fix applied, migrate the VMs off the next host, call support, have the fix applied, and so on. We'll wait for 6.7 U2; the network engineer says the issue is resolved in that update.

BennyFx
Contributor

It seems like we have the same issues with the following setup:

Hosts on VMware ESXi 6.7.0 Build 10764712

HPE DL380 Gen9 and Gen10 Hosts

Cisco 3850 Stack

I will try the workaround from Demouge.

Do you know when an official fix or 6.7 U2 will be released?

_BoRiS_
Contributor

Hello all,

Same problem here with vSphere 6.7 U1 and vSAN 6.7 U1: some packets are lost on a VDS with LACP.

If I ping my ESXi hosts' management VMkernel interface (VDS with LACP) from a physical server, I lose 1 to 5 pings out of 50,000.

For the moment I can't upgrade to 6.7 U2 because my Dell nodes are not yet on the vSAN 6.7 U2 HCL.

No packet drops on the switch side, no packet drops on the vmnic side... but some pings are lost.

Perhaps it's related to 6.7 U1, or perhaps it's my Intel X710-T NIC... I don't know.
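To put those numbers in the same unit as the percentages reported elsewhere in this thread, here is a minimal Python sketch that converts a lost-ping count into a loss percentage. The function name `loss_percent` is only illustrative; the counts are the ones quoted above.

```python
def loss_percent(sent: int, lost: int) -> float:
    """Return packet loss as a percentage of the packets sent."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    if not 0 <= lost <= sent:
        raise ValueError("lost must be between 0 and sent")
    return 100.0 * lost / sent


# Counts quoted in the post: 1 to 5 pings lost out of 50,000.
for lost in (1, 5):
    print(f"{lost} lost of 50000 -> {loss_percent(50000, lost):.3f}%")
```

That works out to roughly 0.002% to 0.01%, which is far below the percent-level loss described in the original post, so this may or may not be the same problem.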

Boris

Dark345
Contributor

I'm experiencing the same problem with 6.7 U2. I'm using a single standard vSwitch with two physical uplinks to an EtherChannel on a Cisco 3750. A VM running on this host can't reach another VM running on a different hypervisor; this is the only failing case. Apart from this, everything works fine, but I had to switch back to Route based on originating port ID to mitigate the problem.

RagsRachamadugu
Contributor

We are seeing this too on VMware ESXi 6.7.0, build 13006603. In our setup, the ESXi host is part of a DVS with a single LAG configured for LACP active mode, and I see 80%+ ping loss even to the default gateway. The ESXi host is attached to a single Cisco leaf: no vPC, just a LAG. I tried a few different load-balancing options without any luck.

Can someone please respond here?

Thanks, Rags

ManivelR
Enthusiast

I'm also facing the same issue.

I found some clues.

According to this first article, it seems to be a known issue in vSphere 6.7 Update 1 Express Patch 7:

Reddit - vmware - ESX 6.7 - intermittently lose network connectivity over Backup NIC

According to this community thread, it seems to be a known issue in vSphere 6.7 Update 1 Express Patch 5:

Esxi 6.7 VM packet drops on standard vSwitch with Route based on IP Hash

My question: I'm using vSphere 6.7 Update 1. Is this a known issue in Update 1 as well? Any ideas?

The link below points to the vSphere 6.7 Update 2 release notes. According to them, the issue has been fixed in vSphere 6.7 Update 2.

https://docs.vmware.com/en/VMware-vSphere/6.7/rn/vsphere-esxi-67u2-release-notes.html

PR 2238134: When you use an IP hash or source MAC hash teaming policy, the packets might drop or go through the wrong uplink

When you use an IP hash or source MAC hash teaming policy, some packets from the packet list might use a different uplink. As a result, some of them might drop or not be sent out through the uplink determined by the teaming policy.

This issue is resolved in this release.
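To illustrate what "the uplink determined by the teaming policy" means for IP hash, here is a minimal Python sketch of a simplified model that is often used to explain the policy: XOR the source and destination IPv4 addresses and take the result modulo the number of uplinks. This is an assumption for illustration only and the real ESXi implementation may differ; the function name `ip_hash_uplink` and all addresses are hypothetical.

```python
import ipaddress


def ip_hash_uplink(src: str, dst: str, uplinks: int) -> int:
    """Pick an uplink index for a source/destination IPv4 pair.

    Simplified model (assumption, not the actual ESXi code):
    XOR the two 32-bit addresses, then take modulo the uplink count.
    """
    if uplinks <= 0:
        raise ValueError("uplinks must be positive")
    s = int(ipaddress.IPv4Address(src))
    d = int(ipaddress.IPv4Address(dst))
    return (s ^ d) % uplinks


# Hypothetical VM and peer addresses on a host with two uplinks.
vm = "10.0.0.10"
for peer in ("10.0.0.1", "10.0.0.2", "10.0.0.3"):
    print(vm, "->", peer, "uplink", ip_hash_uplink(vm, peer, 2))
```

In this model a given source/destination pair always maps to the same uplink, which is what the physical switch's EtherChannel expects. The PR text describes some packets leaving through a different uplink than the one the policy selected. With a single connected uplink there is only one possible choice, which would be consistent with the loss disappearing when one uplink is disconnected.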

Thank you,
