I have an ESXi server with an Intel X520-DA2 10Gb adapter in it. It has an iSCSI datastore connected over one port and VM traffic over the other port. The iSCSI speed couldn't be better, but the problem is that none of my VMs will do over 300 megabits/sec. They're all using VMXNET3 adapters. I have gone so far as to hook the second 10Gb port directly up to another standalone Windows server with an SFP+ cable to eliminate the network/switch as a bottleneck, but I'm still limited to the ~300 megabit ceiling. Any clues as to what could be causing this? Thanks in advance!
Is this of any help?
http://download.intel.com/support/network/sb/fedexcasestudyfinal.pdf
Problem solved!
Actually a bit embarrassing, since we all know troubleshooting should be done on as clean an install as possible.
Symantec Endpoint Protection 12 was causing the problem. Even with Network Threat Protection disabled we still couldn't get more than around 500Mbit/s throughput.
Completely uninstalling SEP jumped the speeds to expected levels.
Having a similar issue, although we are experiencing severe performance degradation, almost to the point where users cannot do their work because our server is terribly slow.
Our issues started when I upgraded to ESXi 4.1 U2 and moved the virtual machines to hardware version 7 with PVSCSI and VMXNET3.
Here's a typical iperf output from my scenario:
[3932] 87.0-88.0 sec 20.4 MBytes 171 Mbits/sec
[3932] 88.0-89.0 sec 26.6 MBytes 223 Mbits/sec
[3932] 89.0-90.0 sec 8.87 MBytes 74.4 Mbits/sec
[3932] 90.0-91.0 sec 10.9 MBytes 91.8 Mbits/sec
[3932] 91.0-92.0 sec 15.5 MBytes 130 Mbits/sec
[3932] 92.0-93.0 sec 8.10 MBytes 68.0 Mbits/sec
[3932] 93.0-94.0 sec 3.53 MBytes 29.6 Mbits/sec
[3932] 94.0-95.0 sec 3.63 MBytes 30.4 Mbits/sec
[3932] 95.0-96.0 sec 3.64 MBytes 30.5 Mbits/sec
[3932] 96.0-97.0 sec 24.9 MBytes 209 Mbits/sec
[3932] 97.0-98.0 sec 36.2 MBytes 304 Mbits/sec
[3932] 98.0-99.0 sec 41.1 MBytes 345 Mbits/sec
[3932] 99.0-100.0 sec 37.9 MBytes 318 Mbits/sec
Very erratic, and dropping below 30 Mbits/sec. I'm in the process of removing our antivirus software (Sophos) to see if that resolves anything.
Slow VMXNET3 performance on a 10Gb connection (and 1Gb too)
Has nobody else had this issue? EVERYBODY has!
I ran some transfer tests using iperf between two virtual W2008R2 machines.
(No file-transfer tests, iperf LAN tests only, so no disk/RAID/datastore issues involved.)
Virtual machine LAN drivers: VMXNET3 (W2008R2 reports them as 10Gb adapters).
No changes to the default VMXNET3 adapter settings.
First machine on an ESXi 4.0 farm (4x IBM x3610 with 1Gb interfaces).
Second machine on ESXi 5.0 (4x Cisco C260 with 1Gb interfaces).
Farms at the same location (but with three Cisco 1Gb switches in between).
Results (note the units: MBytes/s!):
Standard connection (no changes to iperf TCP parameters, all defaults)
------------------------------------------------------------
[164] local 172.19.220.xxx port 49353 connected with 172.19.230.xxx port 5001
[ ID] Interval Transfer Bandwidth
[164] 0.0- 1.0 sec 23.1 MBytes 23.1 MBytes/sec
[164] 1.0- 2.0 sec 22.0 MBytes 22.0 MBytes/sec
[164] 2.0- 3.0 sec 23.0 MBytes 23.0 MBytes/sec
[164] 3.0- 4.0 sec 22.0 MBytes 22.0 MBytes/sec
[164] 4.0- 5.0 sec 22.1 MBytes 22.1 MBytes/sec
[164] 5.0- 6.0 sec 22.4 MBytes 22.4 MBytes/sec
[164] 6.0- 7.0 sec 22.4 MBytes 22.4 MBytes/sec
[164] 7.0- 8.0 sec 22.1 MBytes 22.1 MBytes/sec
[164] 8.0- 9.0 sec 21.9 MBytes 21.9 MBytes/sec
[164] 9.0-10.0 sec 22.2 MBytes 22.2 MBytes/sec
[164] 0.0-10.0 sec 223 MBytes 22.3 MBytes/sec
Very bad.
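For reference, the default run above corresponds to a plain iperf 2.x invocation along these lines (my reconstruction, not the exact command lines; the xxx in the addresses is masked in the original output):
on the server VM: iperf -s
on the client VM: iperf -c 172.19.230.xxx -t 10 -i 1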
Changed the iperf TCP window size from the default 8 kB to 56 kB on the client side
------------------------------------------------------------
[164] local 172.19.220.xxx port 49356 connected with 172.19.230.xxx port 5001
[ ID] Interval Transfer Bandwidth
[164] 0.0- 1.0 sec 81.1 MBytes 81.1 MBytes/sec
[164] 1.0- 2.0 sec 79.0 MBytes 79.0 MBytes/sec
[164] 2.0- 3.0 sec 76.9 MBytes 76.9 MBytes/sec
[164] 3.0- 4.0 sec 83.7 MBytes 83.7 MBytes/sec
[164] 4.0- 5.0 sec 85.3 MBytes 85.3 MBytes/sec
[164] 5.0- 6.0 sec 79.2 MBytes 79.2 MBytes/sec
[164] 6.0- 7.0 sec 83.5 MBytes 83.5 MBytes/sec
[164] 7.0- 8.0 sec 79.4 MBytes 79.4 MBytes/sec
[164] 8.0- 9.0 sec 81.2 MBytes 81.2 MBytes/sec
[164] 9.0-10.0 sec 77.9 MBytes 77.9 MBytes/sec
[164] 0.0-10.0 sec 807 MBytes 80.5 MBytes/sec
Much better (almost 4x).
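(For reference, that window-size change is iperf's -w option on the client side; my reconstruction: iperf -c 172.19.230.xxx -t 10 -i 1 -w 56k)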
Changed the iperf TCP buffer length to 2 MB on the client side
------------------------------------------------------------
[164] local 172.19.220.xxx port 49363 connected with 172.19.230.xxx port 5001
[ ID] Interval Transfer Bandwidth
[164] 0.0- 1.0 sec 106 MBytes 106.0 MBytes/sec
[164] 1.0- 2.0 sec 96.0 MBytes 96.0 MBytes/sec
[164] 2.0- 3.0 sec 100 MBytes 100.0 MBytes/sec
[164] 3.0- 4.0 sec 100 MBytes 100.0 MBytes/sec
[164] 4.0- 5.0 sec 96.0 MBytes 96.0 MBytes/sec
[164] 5.0- 6.0 sec 68.0 MBytes 68.0 MBytes/sec
[164] 6.0- 7.0 sec 88.0 MBytes 88.0 MBytes/sec
[164] 7.0- 8.0 sec 92.0 MBytes 92.0 MBytes/sec
[164] 8.0- 9.0 sec 88.0 MBytes 88.0 MBytes/sec
[164] 9.0-10.0 sec 64.0 MBytes 64.0 MBytes/sec
[164] 0.0-10.0 sec 900 MBytes 89.7 MBytes/sec
90 MBytes/s on a 1Gb interface, which is over 700 Mbits/s on the wire. What more do we need? :)
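(The buffer-length change is iperf's -l option; again a reconstruction: iperf -c 172.19.230.xxx -t 10 -i 1 -l 2M)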
And now all together…
Changed both iperf parameters: TCP window size to 56 kB and buffer length to 2 MB
------------------------------------------------------------
[164] local 172.19.220.xxx port 49365 connected with 172.19.230.xxx port 5001
[ ID] Interval Transfer Bandwidth
[164] 0.0- 1.0 sec 92.0 MBytes 92.0 MBytes/sec
[164] 1.0- 2.0 sec 94.0 MBytes 94.0 MBytes/sec
[164] 2.0- 3.0 sec 82.0 MBytes 82.0 MBytes/sec
[164] 3.0- 4.0 sec 104 MBytes 104.0 MBytes/sec
[164] 4.0- 5.0 sec 98.0 MBytes 98.0 MBytes/sec
[164] 5.0- 6.0 sec 96.0 MBytes 96.0 MBytes/sec
[164] 6.0- 7.0 sec 90.0 MBytes 90.0 MBytes/sec
[164] 7.0- 8.0 sec 98.0 MBytes 98.0 MBytes/sec
[164] 8.0- 9.0 sec 94.0 MBytes 94.0 MBytes/sec
[164] 9.0-10.0 sec 104 MBytes 104.0 MBytes/sec
[164] 0.0-10.0 sec 954 MBytes 95.1 MBytes/sec
:))))))
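(Combined, that is simply both client-side options together; my reconstruction, not the exact command line: iperf -c 172.19.230.xxx -t 10 -i 1 -w 56k -l 2M)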
No ESXi tuning!
Only iperf's TCP parameters were changed, 'inside' the virtual machine!
So now, VMware staff:
please prepare a 'LAN Best Practices' guide explaining where to set these parameters
on W2008R2 / W2012 / RHEL... for VMXNET3, E1000... (registry, NIC driver parameters...).
Perhaps it helps.
Janusz
PS: sorry for my English
[1] More info on UDP vs TCP iPerf testing: http://serverfault.com/questions/354166/iperf-udp-test-show-only-50-of-bandwidth
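(For completeness: a UDP test in iperf 2.x needs an explicit bandwidth target, because UDP mode defaults to about 1 Mbit/s. Something like iperf -s -u on the server and iperf -c <server> -u -b 900M -t 10 on the client.)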
Generally I agree, but:
1. Yes, give each tested VM 1 vCPU to rule out CPU overhead as the cause, and check ESX host CPU utilization. OK,
but that is fine tuning, not an explanation for results 5-10 times lower than expected.
2. Jumbo frames: OK, but my tests show that even without jumbo frames (sorry, so far only on a 1Gb interface)
we can get transfers above 700 Mbit/s.
3. Yes, agreed. I use iperf with TCP only.
4. We can use iperf options for dual/tradeoff bidirectional tests (-d -L 5001, or -d -r -L 5001).
In my tests I used one-way transfers only and changed parameters only on the client side.
Best Regards,
Janusz
Tests are only tests, so I decided to modify just the buffer size (for a start) on some production Windows 2008 R2 servers, and...
the first signal came from the backup staff:
why did our LAN agent backups start running 3 times faster
(from 1 hour 30 minutes down to 30 minutes)?
3x, for a start...
It works.
So, next: TCP parameter tuning on Windows 2008 R2, then on RHEL.
(One small problem: the parameter doesn't exist in the 2008 R2 registry by default,
so you have to locate the interface key and add the value yourself; see the sketch below.)
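A sketch of what this seems to describe, assuming the legacy per-interface TcpWindowSize value (the GUID below is a placeholder; you have to find your NIC's key under Tcpip\Parameters\Interfaces first, and note that on 2008 R2 the officially supported knob is receive-window autotuning via netsh, so treat this as Janusz's approach rather than general advice):
reg add "HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{YOUR-INTERFACE-GUID}" /v TcpWindowSize /t REG_DWORD /d 57344
The rough RHEL equivalent would be the TCP buffer sysctls, for example:
sysctl -w net.ipv4.tcp_rmem="4096 87380 2097152"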
10Gb tests coming soon.
But the question is:
why can't we find this information in a VMware paper, something like
'Best Practices for Windows/Linux VM LAN Tuning'?
I think it would be a hit.
Janusz
I ran into the same issue on ESXi 5.0 U2 + Red Hat 6.x. RHEL 5.x works just fine, but RHEL 6.x just stinks. VMware is aware of this. In the meantime, here is the workaround (the VMware KB article on LRO is slightly off): disable LRO on the host:
esxcli system settings advanced set --int-value 0 -o /Net/VmxnetSwLROSL
esxcli system settings advanced set --int-value 0 -o /Net/Vmxnet3SwLRO
esxcli system settings advanced set --int-value 0 -o /Net/Vmxnet3HwLRO
esxcli system settings advanced set --int-value 0 -o /Net/Vmxnet2SwLRO
esxcli system settings advanced set --int-value 0 -o /Net/Vmxnet2HwLRO
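(You can verify the values with the matching list command, e.g. esxcli system settings advanced list -o /Net/Vmxnet3HwLRO. As far as I know, the guest VMs also need a power cycle, and some guides say a host reboot, before the change takes effect.)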
Hope this ends the misery for all!
This actually made the traffic worse for me. Before, it would hit the ~500 megabit ceiling and stay there; now the bit rate just jumps around wildly...