VMware Cloud Community
wheelz311
Contributor

Virtual Router using VMXNET3 extremely slow

I have a virtual Linux router (Vyatta) set up with a vNIC for each VLAN. I have a Windows box in one VLAN and another Windows box in a second VLAN, and the virtual router routes between the two. Pings work fine, but any larger traffic is extremely slow, to the point of being unusable. Here's the thing: all 3 VMs are on the same vSwitch and the same host. This may be a known issue, as this thread indicates:

http://www.vyatta.org/forum/viewtopic.php?t=3030&postdays=0&postorder=asc&start=0

I found other references on the web to similar problems whenever routing with a VM, on versions 3.5 through 4.0 Update 1. Does anyone know what could be causing this or how to fix it? Some have said that disabling TCP offloading on all guests is a workaround, but that isn't the best solution.

Any help is appreciated! Thanks.

7 Replies
wheelz311
Contributor

I have verified that this only happens with traffic that is routing internally, meaning it only occurs when the traffic is between virtual machines on the same host or between a host and a virtual machine on that host. Any traffic that gets routed through the outside network is not affected.

I also verified that disabling TCP Offloading on the sender's side does work. The question is whether there is a fix that does not require doing this on all virtual machines and the ESXi host (not sure how you would do this on the host though).

wheelz311
Contributor

Additional information: it happens even if the traffic crosses vSwitches (and therefore pNICs). No luck finding a fix other than the workarounds. There are a ton of advanced settings that can be tweaked, but I have no idea where to start or whether any of them would help. I used esxcfg-vmknic -l to verify that TSO MSS was set to 65535 (so TSO is supported on my pNICs). Does anyone have an idea?

DSTAVERT
Immortal

I would file an SR with VMware. I would also press vyatta to work on this with VMware.

-- David -- VMware Communities Moderator
Scissor
Virtuoso

From the linked post, it looks like one workaround is to change your vNICs from VMXNET3 to E1000. I wonder if changing to VMXNET2 would also work?

Scissor
Virtuoso

Found this tidbit in the ESXi 4.1 release notes that might apply to your situation:

http://www.vmware.com/support/vsphere4/doc/vsp_esxi41_vc41_rel_notes.html

Poor TCP performance can occur in traffic-forwarding virtual machines with LRO enabled

Some Linux modules cannot handle LRO-generated packets. As a result, having LRO enabled on a VMXNET 2 or VMXNET 3 device in a traffic forwarding virtual machine running a Linux guest operating system can cause poor TCP performance. LRO is enabled by default on these devices.

Workaround: In traffic-forwarding virtual machines running Linux guests, set the module load time parameter for the VMXNET 2 or VMXNET 3 Linux driver to include disable_lro=1.
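For anyone who wants to check whether LRO is actually on for an interface in the Linux router before and after applying the workaround, here is a rough Python sketch that reads the output of `ethtool -k`. It assumes ethtool is installed in the guest; the function names and the interface name `eth1` are made up for illustration and do not come from the release notes. Older ethtool builds print feature names with spaces ("large receive offload") and newer ones use hyphens, so the parser normalizes both.

```python
import subprocess


def parse_offload_features(ethtool_output):
    """Parse the text printed by `ethtool -k <iface>` into {feature: bool}."""
    features = {}
    for line in ethtool_output.splitlines():
        name, sep, value = line.partition(":")
        tokens = value.split()
        # Skip the header line ("Features for eth1:"), blank lines, and
        # anything whose value is not a plain on/off state.
        if not sep or not tokens or tokens[0] not in ("on", "off"):
            continue
        # Normalize old-style names with spaces to the hyphenated form.
        key = name.strip().replace(" ", "-")
        features[key] = tokens[0] == "on"
    return features


def lro_enabled(iface):
    """Return True if ethtool reports large-receive-offload as on for iface."""
    result = subprocess.run(
        ["ethtool", "-k", iface],
        capture_output=True, text=True, check=True,
    )
    return parse_offload_features(result.stdout).get("large-receive-offload", False)


# Hypothetical usage inside the router VM:
#   print(lro_enabled("eth1"))
```

If this reports LRO as on for the router's forwarding interfaces, that would be consistent with the release note above, though it does not prove LRO is the cause of the slowdown.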

nwsmith
Contributor

We have hit the same problem with poor TCP performance using CentOS 5.5 as a virtual router. The disable_lro=1 fix works, but the difficulty is passing that option when the module first loads. We fixed that by patching and recompiling the vmxnet module. I have posted the details on my blog, here:

http://nwsmith.blogspot.com/2010/07/patching-vmxnet-to-disable-lro.html

Regards

Nigel Smith

Scissor
Virtuoso

wheelz311 -- Thanks for introducing me to Vyatta. It is a lot of fun to play around with.

Just to close the loop on this thread, I see from the linked Vyatta thread that this problem has been fixed in Vyatta 6.1 (Larkspur), which was released in August.
