VMware Cloud Community
gadina
Contributor

Slow network performance between VM's on ESXi 5.5 U1 with Intel nic

Hello all.

There are many stories like mine out there, but I think I have some updates on this issue.

So, I have ESXi 5.5 build 1881737 on a Dell server with a Xeon L5520 and two Intel NICs based on the 82576 chip.

I also have an old ESXi 5.0.0 build 623860 host with a Realtek 100 Mb NIC.

On the ESXi 5.5 host there are two VMs, g1 and g2 - CentOS 6.5 with a VMXNET3 NIC each.
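(For completeness: the server side on g2, assuming default iperf settings, is just:)

[root@g2 ~]# iperf -s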

And here we go:

[root@g1 ~]# iperf -c g2

------------------------------------------------------------

Client connecting to g2, TCP port 5001

TCP window size: 19.3 KByte (default)

------------------------------------------------------------

[  3] local 192.168.*.* port 51461 connected with 192.168.*.* port 5001

[ ID] Interval       Transfer     Bandwidth

[  3]  0.0-10.0 sec  21.2 GBytes  18.2 Gbits/sec

Then I cloned these VMs to the ESXi 5.0 host and ran the same test:

[root@g1copy ~]# iperf -c g2copy

------------------------------------------------------------

Client connecting to g2copy, TCP port 5001

TCP window size: 19.3 KByte (default)

------------------------------------------------------------

[  3] local 192.168.*.* port 51588 connected with 192.168.*.* port 5001

[ ID] Interval       Transfer     Bandwidth

[  3]  0.0-10.0 sec  34.0 GBytes  29.2 Gbits/sec

!!!

[  3]  0.0-10.0 sec  21.2 GBytes  18.2 Gbits/sec on ESXi 5.5

VS

[  3]  0.0-10.0 sec  34.0 GBytes  29.2 Gbits/sec on ESXi 5.0

!!!

I've read a lot about this igb issue and made some changes on ESXi 5.5.

According to

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=201889...

I've changed InterruptThrottleRate:

~ # esxcfg-module -g igb

igb enabled = 1 options = 'InterruptThrottleRate=8000,8000,8000'
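For reference, the option itself is set with esxcfg-module and only takes effect after a reboot; roughly (one value per igb port, as per the KB):

~ # esxcfg-module -s 'InterruptThrottleRate=8000,8000,8000' igb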

and according to

VMware KB: Understanding TCP Segmentation Offload (TSO) and Large Receive Offload (LRO) in a VMware environment

I've set the following LRO options to 0 (disabled):

~ # esxcli system settings advanced set --int-value 0 -o /Net/VmxnetSwLROSL

~ # esxcli system settings advanced set --int-value 0 -o /Net/Vmxnet3SwLRO

~ # esxcli system settings advanced set --int-value 0 -o /Net/Vmxnet3HwLRO

~ # esxcli system settings advanced set --int-value 0 -o /Net/Vmxnet2SwLRO

~ # esxcli system settings advanced set --int-value 0 -o /Net/Vmxnet2HwLRO
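(Each of these can be read back afterwards to confirm the value really is 0, e.g.:)

~ # esxcli system settings advanced list -o /Net/Vmxnet3HwLRO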

I've also found a newer igb driver - newer than the one that shipped with ESXi 5.5:

https://my.vmware.com/web/vmware/details?downloadGroup=DT-ESXI55-INTEL-IGB-525&productId=353

Before the driver update:

~ #  esxcli software vib list | grep igb

net-igb                        5.0.5.1.1-1vmw.550.1.15.1623387        VMware    VMwareCertified   2014-07-09

~ #
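The new driver was then installed from the downloaded offline bundle and the host rebooted; roughly like this (the datastore path and bundle file name here are just an example):

~ # esxcli software vib install -d /vmfs/volumes/datastore1/igb-5.2.5-offline_bundle.zip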

After the driver update:

~ # esxcli software vib list | grep igb

net-igb                        5.2.5-1OEM.550.0.0.1331820             Intel     VMwareCertified   2014-07-09

Unfortunately, none of these modifications helped.

I'm disappointed.

4 Replies
MKguy
Virtuoso

From what I understand, you run this iperf test with both VMs on the same host and port group, right?

In that case the network drivers of the ESXi host are irrelevant as traffic is forwarded internally.

>So, I have ESXi 5.5 build 1881737 on a Dell server with a Xeon L5520 and two Intel NICs based on the 82576 chip.

>I also have an old ESXi 5.0.0 build 623860 host with a Realtek 100 Mb NIC.

What hardware (CPU) do you have on your 5.0 host? This kind of test will typically be limited by the CPU performance a single physical core is able to deliver.
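A quick way to check whether a single stream (and therefore a single core) is the limit is to run a few parallel iperf streams and watch the CPU usage in esxtop while the test runs, e.g. something like:

[root@g1 ~]# iperf -c g2 -P 4 -t 30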

-- http://alpacapowered.wordpress.com
gadina
Contributor

>From what I understand, you run this iperf test with both VMs on the same host and port group, right?

Yes!

Even on a vSwitch with no physical NIC attached, the results are the same.
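(The isolated setup is nothing special - just an extra vSwitch with no uplink and a port group on it; roughly, with example names:)

~ # esxcfg-vswitch -a vSwitchTest

~ # esxcfg-vswitch -A IsolatedPG vSwitchTest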

>In that case the network drivers of the ESXi host are irrelevant as traffic is forwarded internally.

I wouldn't agree with you.

I can't explain exactly why I think this way, but there are so many questions on the Internet about this issue, and they all describe the same configuration: slow network between two VMs on one vSwitch, only when the host has an Intel NIC.

So the physical NIC may be relevant to this issue.

>What hardware (CPU) do you have on your 5.0 host?

ESXi 5.0

4 CPUs x 3.501 GHz

i7-3770

ESXi 5.5

8 CPUs x 2.266 GHz

Xeon L5520

JPM300
Commander

Have you had a quick look through this document to see if there is anything that could assist:

http://www.vmware.com/files/pdf/support/landing_pages/Virtual-Support-Day-Best-Practices-Virtual-Net...

Also is there any possibility that TOE might be enabled on the Intel NIC's or the VM's might have it enabled:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=102406...
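Inside the CentOS guests you can also check (and temporarily toggle) the offload settings with ethtool, for example (eth0 is just a placeholder for the vmxnet3 interface name):

[root@g1 ~]# ethtool -k eth0

[root@g1 ~]# ethtool -K eth0 tso off lro off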

MKguy
Virtuoso

>So the physical NIC may be relevant to this issue.

I have to disagree with you here. The fact that your various adjustments apparently had zero impact also points to that conclusion.

>What hardware (CPU) do you have on your 5.0 host?

ESXi 5.0

4 CPUs x 3.501 GHz

i7-3770

ESXi 5.5

8 CPUs x 2.266 GHz

Xeon L5520

And here is the explanation. As I mentioned, on the same host the iperf test is usually limited by CPU (and probably memory bus) performance. It's all about processing as many packets per second as possible, which costs CPU.

Your 5.0 host has a CPU of a much more recent generation than the 6-year-old Nehalem-based Xeon, plus a much higher clock (and memory bus) rate.

Comparing clock rates alone already shows the connection: the CPU clock rate is about 55% higher, and the network throughput is about 60% higher. (Also consider newer instruction sets and general internal efficiency improvements of recent CPU generations, etc.)
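The rough arithmetic, using the clock speeds and iperf numbers quoted above:

3.501 GHz / 2.266 GHz ≈ 1.55, i.e. ~55% higher clock

29.2 Gbits/sec / 18.2 Gbits/sec ≈ 1.60, i.e. ~60% higher throughput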

With this hardware you are basically comparing apples to oranges and the performance difference is not strange.

-- http://alpacapowered.wordpress.com