oreeh
Immortal

Severe network performance issues

This thread is a follow-up to the following threads since these seem to be related:

http://www.vmware.com/community/thread.jspa?threadID=74329

http://www.vmware.com/community/thread.jspa?threadID=75807

http://www.vmware.com/community/thread.jspa?threadID=77075

Here is a description of the issues and the "results" we have so far:

juchestyle and sbeaver saw a significant degradation of network throughput on 100 Mbit full-duplex virtual switches.

The transfer rate never stabilizes and there are significant peaks and valleys when a 650 MB ISO file gets transferred from a physical server to a VM.

Inspired by this, I did some quick testing and got some strange results:

The transfer direction had a significant impact on transfer speed.

Pushing files from VMs to physical servers was always around 30% faster than pulling files from the servers.

The assumption that this is related to the behaviour of Windows servers was wrong, since it happened regardless of the OS and protocol used.

Another interesting result from these tests: the e1000 NIC always seems to be 10-20% faster than vmxnet, and there is a big difference in PKTTX/s between vmxnet and e1000.

After that, acr discovered really bad transfer speeds in a Gigabit VM environment.

The max speed was 7-9 MB/s, even when using ESX internal vSwitches.

A copy from ESX to ESX reached 7-9 MB/s too.

The weird discovery in this scenario: when the CD-ROM drives are disabled in the VMs, the transfer speed goes up to 20 MB/s.

Any ideas regarding this?

I'll mark my question as answered and ask Daryll to lock the thread so we have everything in one thread.

sbeaver
Leadership

Transfers inside ESX run at bus speed when the VMs are attached to the same vSwitch.

Steve Beaver
VMware Communities User Moderator
VMware vExpert 2009 - 2020
VMware NSX vExpert - 2019 - 2020
====
Co-Author of "VMware ESX Essentials in the Virtual Data Center"
(ISBN:1420070274) from Auerbach
Come check out my blog: http://www.virtualizationpractice.com/blog/
Come follow me on twitter http://www.twitter.com/sbeaver

**The Cloud is a journey, not a project.**
oreeh
Immortal

definitely wrong (not your statement but the docs)

6. FTP from BSD VM to BSD VM over an internal vSwitch (both vmxnet, VMware Tools installed):

6334.75 PKTTX/s, 71.51 MbTX/s, with 7.96 MB/s

This isn't bus speed - not even ISA bus speed.
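A quick unit check (my own arithmetic, not from the thread) shows the esxtop wire rate and the FTP payload rate are at least consistent with each other; the problem is that both sit an order of magnitude below anything resembling bus or memory speed:

# Sanity check on the numbers above: 71.51 Mbit/s on the wire vs. 7.96 MB/s of FTP payload.
wire_mbit_s = 71.51                 # MbTX/s as reported by esxtop
payload_mb_s = 7.96                 # MB/s as reported by the FTP client

payload_mbit_s = payload_mb_s * 8
print("payload:  %.1f Mbit/s" % payload_mbit_s)                  # ~63.7 Mbit/s
print("overhead: %.1f Mbit/s" % (wire_mbit_s - payload_mbit_s))  # headers, ACKs, retransmits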

juchestyle
Commander

From casual observation, it seems this is happening only on HP equipment. Or did someone else say they could reproduce it on other hardware?

Respectfully,

Kaizen!
JonT
Enthusiast

I have one IBM x3950 I can try this on, but it will only be a VM-to-VM copy on the same host. I will test both the same vSwitch and separate vSwitches and post results in a little bit.

oreeh
Immortal

From casual observation, it seems this is happening only on HP equipment.

Yeah.

Or did someone else say they could reproduce it on other hardware?

Only HP has been mentioned in the threads so far.

oreeh
Immortal

Could you please check the following too and post the results?

Download iperf from http://dast.nlanr.net/Projects/Iperf/#download (available for almost every OS)

and run the following tests:

on one VM, start iperf in server mode (iperf -s)

on the other VM, run iperf in client mode (iperf -c serverip)

on the VM where the iperf server is running, run iperf in client mode against 127.0.0.1

(If iperf isn't available in a guest, see the rough stand-in sketched below.)
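For what it's worth, here's a minimal stand-in for anyone who can't install iperf in a guest - my own sketch, TCP only, assuming port 5001 is reachable between the VMs and using a 64 KB send buffer and a 10-second send window; it measures roughly the same sustained throughput:

#!/usr/bin/env python
# tput.py - crude TCP throughput test, a rough stand-in for iperf (not a replacement).
# Usage: "python tput.py server" on one VM, "python tput.py client <serverip>" on the other.
# Port 5001, the 64 KB buffer and the 10-second run are my own assumptions.
import socket, sys, time

PORT = 5001
CHUNK = 64 * 1024
DURATION = 10

def server():
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", PORT))
    srv.listen(1)
    conn, addr = srv.accept()
    total, start = 0, time.time()
    while True:
        data = conn.recv(CHUNK)
        if not data:
            break
        total += len(data)
    elapsed = time.time() - start
    print("received %.1f MB in %.1f s -> %.1f Mbit/s"
          % (total / 1e6, elapsed, total * 8 / elapsed / 1e6))

def client(host):
    conn = socket.create_connection((host, PORT))
    payload = b"x" * CHUNK           # fixed-size chunks sent as fast as possible
    end = time.time() + DURATION
    while time.time() < end:
        conn.sendall(payload)
    conn.close()

if __name__ == "__main__":
    if sys.argv[1] == "server":
        server()
    else:
        client(sys.argv[2])

Run the client against 127.0.0.1 on the server VM as well; that number should be limited only by CPU and memory, which is exactly what makes the localhost results later in this thread interesting.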

juchestyle
Commander

Hmm. Maybe HP and Broadcom NICs need to be off the HCL?

Respectfully,

Kaizen!
JonT
Enthusiast

I am setting this up now, but this host had no VMs on it, so I am cloning a few real quick. I will post the results from iperf as you suggested once I get the VMs up and running. For my tests I have the following networking setup on this IBM host:

OB NIC1 - 100 half duplex (console and cluster port group)

OB NIC2 - 1000 full duplex ("VM Main Network" port group), currently auto-negotiate but I will hard-code this.

The PCI NIC is dual-port and has a crossover cable between its two ports. I have a vSwitch set up on both of these NICs so I can rule out the possibility of a network switching issue.

oreeh
Immortal

Thanks for your efforts

oreeh
Immortal

HP - not sure. If this were a design flaw, one would expect it to affect only one server model / series, but both Xeon and Opteron machines are affected.

Broadcom - yes, I had a lot of (non-ESX) trouble with these NICs in the past; that's why I did some tests using an old Intel dual-port NIC.

acr
Champion

Oliver, I did exactly the same test today using iperf and we got 220 Mbit/s.

Using the real IP we got 60 Mbit/s.

I already did the tests on physical machines, with very impressive results...

This is with two VMs connected to the same vSwitch... nothing else running.

acr
Champion

Steve, I so wish that were the case...

I've done tests on 3 totally different ESX environments using both ESX 3 and 3.0.1...

Always with very poor results...

oreeh
Immortal

To me it seems the problem really isn't only network related, since running the test against localhost gives bad results too.

Localhost traffic is not allowed to leave the system, so the complete transfer is essentially a memory copy (which should be really fast).

Running the test against the real IP should give nearly the same results as localhost, since that traffic shouldn't leave the system either.

Weird.
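To put the "should be a memory copy" point in rough numbers - my own illustration, not something measured in this thread - a plain in-memory copy looks like this:

import time

buf = bytearray(64 * 1024 * 1024)    # 64 MB zero-filled buffer
copies = 10

start = time.time()
for _ in range(copies):
    dst = buf[:]                     # pure in-memory copy, no network stack involved
elapsed = time.time() - start

mb = len(buf) * copies / 1e6
print("copied %.0f MB in %.2f s -> %.0f MB/s" % (mb, elapsed, mb / elapsed))

Even on hardware of that era this lands at roughly hundreds of MB/s or more, so an 8 MB/s loopback result can't be explained by the copy itself.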

acr
Champion

Oliver, of all the tests and results, that's the one that surprises me...

But to add to the confusion: if I now connect an iSCSI initiator in any of the VMs I'm testing, using the same vSwitch that underperforms in the network tests, and use IP to connect to an iSCSI LUN,

I can get up to 140 MB transfer...

How come?

oreeh
Immortal

Only a thought: could it be that there's some kind of undocumented traffic shaping / prioritization inside ESX?

Edit: forget about that - it would mean ESX has some knowledge of layer 4 through layer 7 protocols, and I really don't think that is the case.

acr
Champion

Wish I knew, but it wouldn't surprise me...

I'd really like to know if anyone out there is getting the sort of speeds I was expecting...

I've now tried these tests on ESX 3, ESX 3.0.1 unpatched, and ESX 3.0.1 fully patched...

All with similar results, which are less than acceptable...

oreeh
Immortal

Tomorrow I'll try to get an ESX 2.5 host running and see what happens there.

JonT
Enthusiast

Ok, so the iperf results are in for my x3950:

The interval was 1-10 sec.

Test from one VM to the other on different vSwitches (separate pNICs with a crossover cable): 278 MB transferred @ a 233 MB/s rate.

Test from one VM to the other on the same vSwitch: 416 MB transferred, 349 MB/s.

Test from the VM running the 'server' to itself on the loopback address 127.0.0.1: 468 MB transferred @ 393 MB/s.

I am going to run these same tests on my various HP models and report back. I have a G1 and a G2 of the 585 and a few 385 G1s.

Forgot to mention that the crossover link is 1 Gb, but set to auto-negotiate. I will run these tests one more time with it hard-set to 1 Gb.

oreeh
Immortal

Interesting.

What NICs does the x3950 have / did you use?

What ESX version is running on the x3950?

acr
Champion

JonT, aren't the results from iperf in Mbit/s...?

Because I too got 220 between VMs, but it was 220 Mbit/s.

I hope I'm wrong, as those are the sort of speeds I'd like to achieve...

Oh, and if it is in MB, can I buy your ESX environment from you...??
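For anyone double-checking the units - my own arithmetic, and it assumes the x3950 figures really are iperf's default Mbit/s output - the conversion is just a divide by 8:

# iperf reports bandwidth in Mbit/s by default; dividing by 8 gives MB/s.
for mbit_s in (233, 349, 393, 220, 60):
    print("%d Mbit/s -> %.1f MB/s" % (mbit_s, mbit_s / 8.0))
# 233 -> ~29 MB/s, 349 -> ~44 MB/s, 393 -> ~49 MB/s for the x3950,
# versus ~27.5 MB/s (220 Mbit/s) and ~7.5 MB/s (60 Mbit/s) reported earlier.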
