VMware Cloud Community
TomekL79
Contributor

Intermittent connectivity under high load after a couple of hours of ESXi 6.7.3U3b uptime

Hello,


I've run into a very strange issue on ESXi 6.7.3U3b (AMD 7302P, Supermicro WR1014 platform). It got to the point that, after a couple of days of troubleshooting, I had to back out of virtualization and move the firewall (latest pfSense 2.4.4 on FreeBSD 11) to a physical machine.

In short: VMware and/or pfSense cuts off layer 3 network connectivity (ARP still gets through) at random times, for 1-30 seconds depending on load, on VMXNET3 interfaces connected to switches which are connected to physical NICs. Traffic within the virtual switches is not affected, nor is traffic to the management interface. It happens only to the firewall, only under load, and only after a couple of hours of ESXi uptime. Rebooting ESXi "fixes" the problem for a couple of hours, and during that initial period it cannot be triggered even with high load. During an outage, tcpdump on the firewall shows no incoming traffic from the physical NICs except ARP.
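For anyone trying to reproduce this, it may help to measure the outage windows and line them up against ESXi uptime. Below is a minimal sketch, assuming reachability is sampled as (timestamp in seconds, success flag) pairs, for example from a periodic ping through the firewall. The function name `find_outages` and the `min_duration` threshold are illustrative assumptions, not part of any VMware or pfSense tooling.

```python
from typing import Iterable, List, Tuple


def find_outages(
    samples: Iterable[Tuple[float, bool]],
    min_duration: float = 1.0,
) -> List[Tuple[float, float]]:
    """Return (start, end) windows during which probes failed continuously.

    samples: (timestamp_seconds, ok) pairs in chronological order.
    min_duration: windows shorter than this (in seconds) are ignored.
    """
    outages: List[Tuple[float, float]] = []
    start = None       # timestamp of the first failed probe in the current run
    last_fail = None   # timestamp of the most recent failed probe

    for ts, ok in samples:
        if not ok:
            if start is None:
                start = ts
            last_fail = ts
        elif start is not None:
            # The outage ends at the first successful probe after failures.
            if ts - start >= min_duration:
                outages.append((start, ts))
            start = None

    # Handle a run of failures that is still open at the end of the data.
    if start is not None and last_fail - start >= min_duration:
        outages.append((start, last_fail))

    return outages
```

Feeding it samples collected from boot onward would show whether the first window really appears only after a couple of hours, and whether window length tracks the load.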

What I've checked so far (none of it helped): I increased the RX/TX buffers on pfSense (no drops were reported on the interfaces). I disabled TSO, LRO and hardware offloading (all three boxes checked). I did clean installations of pfSense (2.4.4, 2.4.5 and 2.5.0 devel) without any filters, just as a routing platform, and the issue remained. On the VMware side, esxtop shows no drops on the vSwitches; there are some drops on the interfaces, but they do not increase during the outages. I also disabled Pause RX/TX/autonegotiation and TSO on VMware, which didn't help either.

Because of the very poor network performance I had to revert to a physical platform for the firewall. Has anyone experienced something similar? Did I overlook something in the configuration? What is really odd, and what makes troubleshooting hard, is that rebooting ESXi "fixes" the issue for a couple of hours.

Kind regards,
Tomek
