VMware Cloud Community
billybobSSD
Contributor

Slow VMXNET3 performance on 10gig connection

I have an ESXi server with an Intel X520-DA2 10 gig adapter in it. It has an iSCSI datastore connected over one port and VM traffic over the other port. The iSCSI speed couldn't be better, but the problem is that none of my VMs will do over 300 megabits/sec. They're all using VMXNET3 adapters. I have gone so far as to hook the second 10gig port directly up to another standalone Windows server with an SFP+ cable to eliminate the network/switch as a bottleneck, but I am still limited to the ~300 megabit ceiling. Any clues to what could be causing this? Thanks in advance!

68 Replies
ROE
Contributor

Problem solved!

Actually a bit embarrassing, since we all know troubleshooting should be done on installs that are as clean as possible.

Symantec Endpoint Protection 12 was causing the problem. Even with Network Threat Protection disabled we still couldn't get more than around 500Mbit/s throughput.

Completely uninstalling SEP jumped the speeds to expected levels.

ouderblade
Contributor

Having a similar issue; however, we are experiencing severe performance degradation, almost to the point where users cannot do their work because our server is terribly slow.

Our issues started when I upgraded to ESXi 4.1 U2 and upgraded the virtual machines to hardware version 7 with paravirtual SCSI and VMXNET3 adapters.

Here's a typical iperf output from my scenario:

[3932] 87.0-88.0 sec  20.4 MBytes   171 Mbits/sec
[3932] 88.0-89.0 sec  26.6 MBytes   223 Mbits/sec
[3932] 89.0-90.0 sec  8.87 MBytes  74.4 Mbits/sec
[3932] 90.0-91.0 sec  10.9 MBytes  91.8 Mbits/sec
[3932] 91.0-92.0 sec  15.5 MBytes   130 Mbits/sec
[3932] 92.0-93.0 sec  8.10 MBytes  68.0 Mbits/sec
[3932] 93.0-94.0 sec  3.53 MBytes  29.6 Mbits/sec
[3932] 94.0-95.0 sec  3.63 MBytes  30.4 Mbits/sec
[3932] 95.0-96.0 sec  3.64 MBytes  30.5 Mbits/sec
[3932] 96.0-97.0 sec  24.9 MBytes   209 Mbits/sec
[3932] 97.0-98.0 sec  36.2 MBytes   304 Mbits/sec
[3932] 98.0-99.0 sec  41.1 MBytes   345 Mbits/sec
[3932] 99.0-100.0 sec  37.9 MBytes   318 Mbits/sec

Very erratic, and at times it drops below 30 Mbits/sec. I'm in the process of removing the antivirus software (Sophos) to see if that resolves anything.

janusz_w
Contributor

Slow VMXNET3 performance on 10gig connection (and 1Gb too)

Nobody else has had this issue? EVERYBODY HAS!

I ran some transfer tests using iperf between two virtual W2008R2 machines.

(No file-transfer tests, iperf only tests the LAN, so no disk/RAID/datastore issues.)

Virtual machine LAN adapters: VMXNET3 (W2008R2 reports them as 10Gb adapters).

No changes to the default VMXNET3 adapter settings.

First machine on an ESXi 4.0 farm (4 x IBM x3610 with 1Gb interfaces).

Second machine on an ESXi 5.0 farm (4 x Cisco C260 with 1Gb interfaces).

The farms are at the same location (but with 3 Cisco 1Gb switches in between).

Results (in MBytes/s!):

Standard connection (no change to the iperf TCP parameters, all defaults):

------------------------------------------------------------
[164] local 172.19.220.xxx port 49353 connected with 172.19.230.xxx port 5001
[ ID] Interval       Transfer     Bandwidth
[164]  0.0- 1.0 sec  23.1 MBytes  23.1 MBytes/sec
[164]  1.0- 2.0 sec  22.0 MBytes  22.0 MBytes/sec
[164]  2.0- 3.0 sec  23.0 MBytes  23.0 MBytes/sec
[164]  3.0- 4.0 sec  22.0 MBytes  22.0 MBytes/sec
[164]  4.0- 5.0 sec  22.1 MBytes  22.1 MBytes/sec
[164]  5.0- 6.0 sec  22.4 MBytes  22.4 MBytes/sec
[164]  6.0- 7.0 sec  22.4 MBytes  22.4 MBytes/sec
[164]  7.0- 8.0 sec  22.1 MBytes  22.1 MBytes/sec
[164]  8.0- 9.0 sec  21.9 MBytes  21.9 MBytes/sec
[164]  9.0-10.0 sec  22.2 MBytes  22.2 MBytes/sec
[164]  0.0-10.0 sec   223 MBytes  22.3 MBytes/sec

Very bad :(

Changed the iperf TCP window size (-w) from the default 8 kB to 56 kB on the client side:

------------------------------------------------------------
[164] local 172.19.220.xxx port 49356 connected with 172.19.230.xxx port 5001
[ ID] Interval       Transfer     Bandwidth
[164]  0.0- 1.0 sec  81.1 MBytes  81.1 MBytes/sec
[164]  1.0- 2.0 sec  79.0 MBytes  79.0 MBytes/sec
[164]  2.0- 3.0 sec  76.9 MBytes  76.9 MBytes/sec
[164]  3.0- 4.0 sec  83.7 MBytes  83.7 MBytes/sec
[164]  4.0- 5.0 sec  85.3 MBytes  85.3 MBytes/sec
[164]  5.0- 6.0 sec  79.2 MBytes  79.2 MBytes/sec
[164]  6.0- 7.0 sec  83.5 MBytes  83.5 MBytes/sec
[164]  7.0- 8.0 sec  79.4 MBytes  79.4 MBytes/sec
[164]  8.0- 9.0 sec  81.2 MBytes  81.2 MBytes/sec
[164]  9.0-10.0 sec  77.9 MBytes  77.9 MBytes/sec
[164]  0.0-10.0 sec   807 MBytes  80.5 MBytes/sec

Much better :)  (almost 4 times faster)

Changed the iperf TCP buffer length (-l) to 2 MB on the client side:

------------------------------------------------------------
[164] local 172.19.220.xxx port 49363 connected with 172.19.230.xxx port 5001
[ ID] Interval       Transfer     Bandwidth
[164]  0.0- 1.0 sec   106 MBytes 106.0 MBytes/sec
[164]  1.0- 2.0 sec  96.0 MBytes  96.0 MBytes/sec
[164]  2.0- 3.0 sec   100 MBytes 100.0 MBytes/sec
[164]  3.0- 4.0 sec   100 MBytes 100.0 MBytes/sec
[164]  4.0- 5.0 sec  96.0 MBytes  96.0 MBytes/sec
[164]  5.0- 6.0 sec  68.0 MBytes  68.0 MBytes/sec
[164]  6.0- 7.0 sec  88.0 MBytes  88.0 MBytes/sec
[164]  7.0- 8.0 sec  92.0 MBytes  92.0 MBytes/sec
[164]  8.0- 9.0 sec  88.0 MBytes  88.0 MBytes/sec
[164]  9.0-10.0 sec  64.0 MBytes  64.0 MBytes/sec
[164]  0.0-10.0 sec   900 MBytes  89.7 MBytes/sec

90 MBytes/s on a 1Gb interface (roughly 720 Mbits/s), what else do we need :)))

And now all together…

Changed both the iperf TCP window size (-w 56k) and the buffer length (-l 2M) on the client side:

------------------------------------------------------------
[164] local 172.19.220.xxx port 49365 connected with 172.19.230.xxx port 5001
[ ID] Interval       Transfer     Bandwidth
[164]  0.0- 1.0 sec  92.0 MBytes  92.0 MBytes/sec
[164]  1.0- 2.0 sec  94.0 MBytes  94.0 MBytes/sec
[164]  2.0- 3.0 sec  82.0 MBytes  82.0 MBytes/sec
[164]  3.0- 4.0 sec   104 MBytes 104.0 MBytes/sec
[164]  4.0- 5.0 sec  98.0 MBytes  98.0 MBytes/sec
[164]  5.0- 6.0 sec  96.0 MBytes  96.0 MBytes/sec
[164]  6.0- 7.0 sec  90.0 MBytes  90.0 MBytes/sec
[164]  7.0- 8.0 sec  98.0 MBytes  98.0 MBytes/sec
[164]  8.0- 9.0 sec  94.0 MBytes  94.0 MBytes/sec
[164]  9.0-10.0 sec   104 MBytes 104.0 MBytes/sec
[164]  0.0-10.0 sec   954 MBytes  95.1 MBytes/sec

  :))))))

No ESXi tuning!

Only the iperf TCP parameters were changed, inside the virtual machines!
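
For reference, the client-side command for that last run would have looked roughly like this (a sketch only, since the post doesn't show the exact invocations, and the server address is redacted as in the output above):

server:  iperf -s
client:  iperf -c 172.19.230.xxx -w 56k -l 2M -t 10 -i 1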

So now, VMware staff!

Please prepare a 'LAN Best Practices' paper that says where to set these parameters

on W2008R2/W2012/RHEL... for VMXNET3, E1000... (registry, NIC driver parameters...).

Perhaps it will help :)

Janusz

PS: sorry for my English

Scissor
Virtuoso

Some thoughts after briefly skimming this thread:
  • Configure your test Guests with 1 vCPU each to rule out SMP scheduling overhead as the cause of the problem.  Ensure that the ESX Host CPU utilization is ok before starting testing. 
  • I would suggest getting rid of Jumbo frames unless you can confirm that every single piece of equipment between your two Guests has been properly configured with the same Jumbo frame parameters (only really applicable when Guests are on different Hosts). 
  • Why did you decide to use UDP when testing with iPerf [1]?  Please re-run your tests using TCP.  Also, at higher speeds increasing the iperf TCP Window Size parameter ( -w ) can make huge differences in throughput.
  • One thing about iperf that always seemed "backwards" to me:  During a test, it is the iperf Client that sends data to the iperf Server.
    Below are stats from testing between two Win2012 Guests on the same ESXi 5.1 Host using different iperf TCP Window Sizes:
    ===============================================================
    server: iperf -w 64k -s
    client: iperf -w 64k -c x.x.x.x
    C:\Temp\iperf-2.0.5-cygwin> .\iperf.exe -c 192.168.120.20
    ------------------------------------------------------------
    Client connecting to 192.168.120.20, TCP port 5001
    TCP window size: 64.0 KByte (default)
    ------------------------------------------------------------
    [  3] local 192.168.120.22 port 65051 connected with 192.168.120.20 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  3]  0.0-10.2 sec   549 MBytes   451 Mbits/sec
    ===============================================================
    ===============================================================
    server: iperf -w 256k -s
    client: iperf -w 256k -c x.x.x.x
    PS C:\Temp\iperf-2.0.5-cygwin> .\iperf.exe -c 192.168.120.20 -w 256k
    ------------------------------------------------------------
    Client connecting to 192.168.120.20, TCP port 5001
    TCP window size:  256 KByte
    ------------------------------------------------------------
    [  3] local 192.168.120.22 port 18217 connected with 192.168.120.20 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  3]  0.0-10.2 sec  7.96 GBytes  6.72 Gbits/sec
    PS C:\Temp\iperf-2.0.5-cygwin>
    ===============================================================

    [1] More info on UDP vs TCP iPerf testing:  http://serverfault.com/questions/354166/iperf-udp-test-show-only-50-of-bandwidth 

janusz_w
Contributor

Generally I agree, but:

1. Yes, 1 vCPU for each test VM to rule out SMP overhead, and checking the ESX host CPU utilization - OK,

    but that is fine tuning, not an explanation for results 5-10 times lower than expected.

2. Jumbo frames - OK, but my tests show that even without jumbo frames (sorry, so far only on 1Gb interfaces)

    we can get transfers of around 900 Mb/s.

3. Yes, agreed - I use iperf only with TCP.

4. We can use the iperf options for dual/tradeoff bidirectional tests (-d -L 5001 or -d -r -L 5001); see the sketch after this list.

    In my tests I used only one-way transfers and changed parameters only on the client side.
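
As a sketch of those bidirectional options (the server address is a placeholder):

iperf -c <server-ip> -d -L 5001     (both directions at the same time)
iperf -c <server-ip> -r -L 5001     (one direction, then the other, i.e. tradeoff mode)

Here -L is the port the client listens on for the connection coming back from the server.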

Best Regards,

Janusz

janusz_w
Contributor

Tests are only tests, so I decided to modify just the buffer size (for a start) on some production Windows 2008 R2 servers, and...

the first signal came from the backup staff:

why did our LAN agent backup start running 3 times faster? :)

(from 1 hour 30 minutes down to 30 minutes)

3 times faster, for a start...

It works.

So now TCP parameter tuning on Windows 2008 R2, then on RHEL.

(A small problem is that in the Windows 2008 R2 registry this parameter does not exist by default,

so you have to locate the right network interface key and add the parameter yourself.)
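
As an example only (the exact value name depends on what you tune, this one is just an illustration, and 2008 R2 normally relies on TCP receive-window auto-tuning, so test before touching production servers):

rem hypothetical example - substitute the GUID of the NIC you want to tune
reg add "HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{<interface-GUID>}" /v TcpWindowSize /t REG_DWORD /d 65535 /f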

Soon, tests on 10Gb.

But the question is:

why can't we find this kind of information in a VMware paper like

'Best Practices for Windows/Linux VM LAN tuning'???

I think it would be a HIT.

Janusz

MR-Z
VMware Employee

I ran into the same issue as well on ESXi 5.0 U2 + Red Hat 6.x. RH 5.x works just fine, but RH 6.x just stinks. VMware is aware of this. In the meantime, here is the solution (the VMware KB on LRO is slightly off):

Disable LRO - DocWiki

or this:

esxcli system settings advanced set --int-value 0 -o /Net/VmxnetSwLROSL
esxcli system settings advanced set --int-value 0 -o /Net/Vmxnet3SwLRO
esxcli system settings advanced set --int-value 0 -o /Net/Vmxnet3HwLRO
esxcli system settings advanced set --int-value 0 -o /Net/Vmxnet2SwLRO
esxcli system settings advanced set --int-value 0 -o /Net/Vmxnet2HwLRO
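
To double-check that a change took effect, the current value of each option can be read back, for example:

esxcli system settings advanced list -o /Net/Vmxnet3HwLRO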

Hope this ends the misery for all!

billybobSSD
Contributor

This actually made the traffic worse. Before, it would hit the ~500 megabit ceiling and stay there; now the bit rate just jumps around wildly...
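
In case anyone else tries this and wants to back it out, the same options can be set back to their default value of 1, for example:

esxcli system settings advanced set --int-value 1 -o /Net/Vmxnet3HwLRO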
