VMware Cloud Community
oreeh
Immortal

Severe network performance issues

This thread is a follow-up to the following threads since these seem to be related:

http://www.vmware.com/community/thread.jspa?threadID=74329

http://www.vmware.com/community/thread.jspa?threadID=75807

http://www.vmware.com/community/thread.jspa?threadID=77075

A description of the issues and the "results" we have so far:

juchestyle and sbeaver saw a significant degradation of network throughput on virtual switches running at 100 Mbit full duplex.

The transfer rate never stabilizes, and there are significant peaks and valleys when a 650 MB ISO file is transferred from a physical server to a VM.

Inspired by this, I did some quick testing with some strange results:

The transfer direction had a significant impact on transfer speed: pushing files from VMs to physical servers was consistently faster (by around 30%) than pulling files from those servers. The assumption that this is related to the behaviour of Windows servers was wrong, since it happened regardless of the OS and protocol used.

Another interesting result from these tests: the e1000 vNIC always seems to be 10-20% faster than vmxnet, and there is a big difference in PKTTX/s between vmxnet and e1000.

After that, acr discovered really bad transfer speeds in a Gigabit VM environment.

The max speed was 7-9 MB/s, even when using ESX internal vSwitches.

A copy from ESX to ESX reached 7-9 MB/s too.

The weird discovery in this scenario: when the CD-ROMs in the VMs are disabled, the transfer speed goes up to 20 MB/s.

Any ideas regarding this?

I'll mark my question as answered and ask Daryll to lock the thread so we have everything in one thread.

387 Replies
oreeh
Immortal

There's a thread covering the UDP issue: http://www.vmware.com/community/thread.jspa?messageID=540298

There's even a solution to it ;-)

postfixreload
Hot Shot

I don't want to believe that it has an effect; however, I know that network performance sometimes improves when you don't have a CD-ROM connected!

The CD-ROM only affects Windows VMs. Windows keeps polling for the presence of media in the CD-ROM drive. If you put media into the ESX server's drive, you may even lose packets to or from the VM. The reason: if the host or any other VM is holding the CD-ROM and your Windows VM tries to read it, it has to wait, and since the VM's networking also needs CPU cycles, it too waits until the CD-ROM check has finished.
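If you want to rule this out, the simplest test is to disconnect the CD-ROM rather than remove it. As a sketch (assuming the drive is the usual ide1:0 device; check your own .vmx for the actual name), the relevant .vmx entries would be:

# keep the CD-ROM device, but do not connect it at power-on

ide1:0.present = "TRUE"

ide1:0.startConnected = "FALSE"

Alternatively, untick "Connected" and "Connect at power on" for the CD-ROM drive in the VI Client.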

postfixreload
Hot Shot

Hi Oliver,

Sorry to ask this... are we trying to find out why the transfer rate is slower from VM to VM than from VM to host, or are we looking at overall network performance when copying a file? And what are we expecting here?

thanks

oreeh
Immortal

Sorry to ask this...

there's nothing to be sorry about...

We are trying to find out why network performance in general is slow.

We only use VM to VM / VM to localhost tests since they are easier to run, and at the beginning we discovered that the throughputs are related.

And what we are expecting here?

I'd say at least 70-90% of the performance of the physical world.

VMware states somewhere that the performance overhead of ESX is about 10%.

Now if I have a VM, I expect it to reach a reasonable network throughput.

In particular, if a VM runs on top of a DL385 G2 with two dual-core CPUs and a Gigabit network, I expect it to be at least as fast as a physical PIII-800 machine on a 100 Mbit network (using the same setup).

juchestyle
Commander

Overall network performance.

It seems like we are not seeing what we want to see. At our organization we lost two VMs that moved back to physical hardware because we couldn't get the speed.

ah, late for a meeting!

Kaizen!
postfixreload
Hot Shot

I was talking to someone about this issue, and an interesting direction to look at is the VMFS block size. As we all know, the default size is 8 MB, so what about creating a VMFS with a 1 MB block size? I'll see if I can find something to test on. If anyone else can try this, please share the results.
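If anyone wants to try this, a test VMFS3 volume with a 1 MB block size can be created from the service console along these lines. Note this is only a sketch: the vmhba path below is a placeholder for a spare LUN/partition, and -C destroys whatever is on it, so double-check the target first.

vmkfstools -C vmfs3 -b 1m -S blocktest vmhba1:0:0:1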

CWedge
Enthusiast

I was talking to someone about this issue, and an interesting direction to look at is the VMFS block size. As we all know, the default size is 8 MB, so what about creating a VMFS with a 1 MB block size? I'll see if I can find something to test on. If anyone else can try this, please share the results.

As stated earlier in this thread, the testing we are doing does not write to disk (at least I don't see it writing), so the block size shouldn't make much of a difference.

If you do look into that, though, there is a discussion on disk subsystems.

CWedge
Enthusiast

There's a thread covering the UDP issue: http://www.vmware.com/community/thread.jspa?messageID=540298

There's even a solution to it ;-)

Well, the solution is for ESX 3.0.1 with no patches... unfortunately I have all of them.

Maybe that's it?

I'm thinking about trying the e1000 idea...
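For reference, switching the vNIC type is a matter of powering the VM off and changing (or adding) one line in the .vmx, assuming the adapter is ethernet0:

ethernet0.virtualDev = "e1000"

The guest then sees a new Intel PRO/1000 adapter, so IP settings may have to be re-entered inside the guest.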

oreeh
Immortal

Sounds weird - but I'll try it.

oreeh
Immortal

this should help a bit

oreeh
Immortal

just tried the VMFS block size hint - absolutely no difference (as expected)

postfixreload
Hot Shot

Hi Oliver,

I will reopen ticket 387278 for you and take over the ticket as well. I'm gathering all the information I need to file a problem report on this issue. Please upload your latest vm-support log files to the FTP. I will be in training for the coming week, which may slow everything down a bit, but I will look into the log files and post anything I find there.

Thanks

Johnny

acr
Champion

Are these tickets worth anything? How many need opening?

postfixreload
Hot Shot

Hi acr,

If I remember right, I personally only helped on one ticket, 374322. There was a problem on the SAN side in that case, and I can't say it's the same issue everyone else here is seeing. As you know, a performance issue can be caused by many different parts: the NIC, a bad driver, physical resources, storage, or even a busy network. The problem report we file with engineering needs to be specific. That is why we need to look into each ticket to find out what the problem is, or at least a direction for engineering to look at. If you do have this problem, please feel free to file a ticket and tell the TSE to contact Johnny Zhang in Burlington. I'm collecting all the information needed before the problem report can be filed.

thanks

Johnny

oreeh
Immortal

Hi Johnny,

I will reopen ticket 387278 for you and I will take over the ticket as well.

Thanks

Please upload your latest vm-support log files to the ftp.

the files are on the way

Oliver

oreeh
Immortal

I'll post any news, insights, ... to this thread.

It seems to me - since Johnny joined in on this - that our begging was heard.

angsana
Enthusiast

Hello oreeh, JonT and others,

Thank you for making all the measurements.

May I ask a few clarifying questions with respect to the configuration of the various tests?

In particular, I'd like to seek some clarification with respect to the VM-to-VM networking tests that are reported in this thread. I think the term VM-to-VM may refer to any of the following situations, which actually turn out to take very different data paths:

(1) VM-to-VM on one vswitch

(2) VM-to-VM on two different ESX boxes

(3) VM-to-VM on two different vswitches on the same ESX box

For completeness, the other configuration that may be of interest is:

(4) VM to non-virtualized system (sometimes referred to as "native", as opposed to a virtualized installation).

And, not to forget, the "local loopback via 127.0.0.1" that this thread has already discussed:

(5) Local loop-back within an OS (VM or native).

The differences between these configurations are as follows:

  • In configuration (1), no data goes onto the networking wire. The performance here is highly dependent on the server hardware: its CPU and memory subsystem. Newer servers are clearly better than older ones, sometimes quite markedly so. The type of virtual NIC can also affect performance. Of the three types of virtual NIC on ESX, vlance is typically the slowest. In my experience vmxnet is generally slightly faster than the e1000 vNIC (which runs contrary to oreeh's results), but the gap is not as large as compared to vlance.

- Digression: on ESX 3.0.0 and ESX 3.0.1 there is a known UDP tx performance problem, mentioned before, for which the workaround is to set Ethernet.features=0 (a sketch of how this might be applied follows this list). This bug has not been taken care of by any of the patches yet. For those who are interested, this performance problem is a combination of how the Windows networking stack deals with UDP and its interaction with vmxnet. You won't see it with a Linux VM, for example.

- With ESX 3.0.0/3.0.1, vlance and vmxnet are combined into a single "Flexible" device. Installing VMware Tools should make such a virtual NIC operate in vmxnet mode, but earlier on there was a bug where this failed to work for some Linux 2.6.x guest OSes. The result was that such virtual NICs operated only as vlance, at much lower performance than vmxnet. This bug was fixed in a patch a few months ago. For the patch to be effective, VMware Tools has to be reinstalled in affected VMs after the patch has been applied.

  • In configuration (2), data clearly hits the physical NIC and the external network wire and switches. Here, one thing to isolate in performance runs, if possible, is congestion in the external network. If possible, please use either direct cabling (cross-over for 10/100BaseT; 1000BaseT can work with a normal cable) or an isolated switch.

- If 10/100BaseT switches are involved in an ESX setup, please make sure they are switches and not hubs. ESX networking puts the physical NIC into promiscuous mode. With an external switch, the switch filters out irrelevant unicast traffic, forwarding only relevant unicast traffic to the physical NICs on the ESX host. If hubs are used, however, unicast traffic is delivered to all ports, which means irrelevant unicast traffic is only filtered out after it has been received by the ESX networking software. If there is other traffic on the hub, this clearly results in CPU overhead. Functionally things will work; it is performance that suffers.

- In configuration (2), it is also possible that a different physical NIC and/or driver version may have an effect on performance.

  • Configuration (3) is similar to (2) in that network traffic traverses the physical NIC and the external network wire/switches. It is actually more demanding than (2) in that both transmit and receive processing happen on the same server. If the server is small, say with a total of only 2 CPU cores (e.g. 2 sockets with single-core processors), performance of (3) may be lower than (2). If there is ample processing power (e.g. 4 or 8 cores on a new server), the impact is probably not as big. Please note that configuration (3) is nowhere near similar to configuration (1), even though at a high level both are VM-to-VM traffic on the same ESX server.

  • Configuration (4) rounds out the picture and is also useful as a controlled way to troubleshoot any network performance problem. For example, transmit performance from a VM can be measured without coupling it to the VM receive performance limit by using a "native" system as the receiver. Similarly, receive performance into a VM can be measured without coupling it to the VM transmit performance limit.

  • Configuration (5) does not involve any NIC, virtual or physical. The loop-back traffic simply turns around at the bottom of the (guest) OS's network stack without touching any NIC.
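Regarding the UDP tx workaround mentioned in the digression above: I would expect it to be applied as a per-adapter option in the VM's .vmx file, roughly as sketched below. Please treat the exact key name as an assumption and confirm it with support before relying on it.

# assumed .vmx form of the "Ethernet.features=0" workaround (per adapter)

ethernet0.features = "0"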

In general, it is also important to take the OS into account, whether it is a guest OS in a VM or the OS on a native system. Different OSes have different networking characteristics; the above-mentioned UDP tx issue is an example. And of course, different OSes on the same physical or virtual hardware can show different performance.

Another thing to note is the socket buffer size. Virtualized networking typically has longer latency. With TCP this is not a problem as long as a reasonable socket size is used. So with network micro-benchmarking tools, it is important to note the socket size used and report it as part of the performance result. Of course, to get higher throughput, a larger socket size should be used.

So now back to my questions :).

(Q1) Are the VM-VM tests reported here configuration (1)? (Especially for the large suite of performance results obtained by JonT and oreeh -- thanks to both of you for doing so many measurements.)

(Q2) What is the setup where you are having the most performance problem(s)? (Which configuration as defined above, and what hardware: server PCPU # and type, amount of physical memory, physical NIC, what kind of slot is the physical NIC plugged into, i.e. what type of PCI, speed, width; VM configuration: # VCPU, amount of memory, # and type of vNIC, what virtual devices.)

As a reference, I'm more used to running the "netperf" network micro-benchmark (instead of iperf). With a VM on ESX 3.0.0/3.0.1, when run with something like:

netperf -H <receiver-host-name> -l 30 -t TCP_STREAM -- -m 8192 -M 8192 -s 65536 -S 65536

I have no problem reaching 400-500 Mbps or more (depending on the guest OS type) even on very old hardware (e.g. a Dell 1600), and no problem reaching GigE wire speed (~930 Mbps) on new servers (Intel Woodcrest processors or recent Opterons).
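For those in this thread using iperf instead, a roughly equivalent run (assuming the classic iperf 1.x options) would pin the socket buffer and message size explicitly:

iperf -s -w 64K   # on the receiver

iperf -c <receiver-host> -w 64K -l 8K -t 30   # on the sender

As with netperf, the socket buffer size (-w) should be reported together with the result.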

Thanks everyone.

- Boon

oreeh
Immortal

So now back to my questions :).

(Q1) Are the VM-VM tests reported here configuration (1)?

(Especially for the large suite of performance results obtained by JonT and oreeh -- thanks to both of you for doing so many measurements.)

In my tests, VM to VM means the same vSwitch except where stated otherwise.

(Q2) What is the setup where you are having the most performance problem(s)?

(Which configuration as defined above, and what hardware: server PCPU # and type, amount of physical memory, physical NIC,

what kind of slot is the physical NIC plugged into, i.e. what type of PCI, speed, width;

VM configuration: # VCPU, amount of memory, # and type of vNIC, what virtual devices.)

VM to VM same vSwitch (1), different vSwitch (3), VM to non-virtualized (4)

My ESX test boxes are

- DL380G3, 2*Xeon 3GHz, 8GB RAM, local storage, pNIC Intel Pila8472 (100M environment)

- DL385G2, 2*AMD DC 2.6GHz, 8GB RAM, local and SAN storage, HP NC364T PCI-e (100M environment)

- DL385G2, 2*AMD DC 2.6GHz, 8GB RAM, local and SAN storage, HP NC364T PCI-e (1000M environment)

- DL585G2, 2*AMD DC 2.6GHz, 16GB RAM, local and SAN storage, HP NC340T PCI-X, NC364T PCI-e (100M environment)

- DL585G2, 2*AMD DC 2.6GHz, 16GB RAM, local and SAN storage, HP NC340T PCI-X, NC364T PCI-e (1000M environment)

Switch used: HP5308XL, latest firmware

VMs: always FreeBSD 6.2, 1 vCPU, 512MB RAM, VMware tools installed, e1000 NIC

TCP window size in all tests: 64K (FreeBSD default size).

The window size in Windows was adjusted to 64K, since the Windows default is 8K (a registry sketch follows below).

No reservations, shares or limits (besides the defaults).
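For the record, the Windows adjustment would normally be done in the registry; the sketch below shows the usual Windows 2000/2003 location (my assumption of the standard key, not necessarily how it was done here; 0xffff = 65535, i.e. ~64K; back up the registry first, and a reboot is required):

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]

"TcpWindowSize"=dword:0000ffff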

Last test results:

- VM1 to VM2 (same vSwitch), DL385G2, SAN, internal vSwitch (no pNIC attached): 382 Mbits/sec

- VM1 to VM2 (same vSwitch), DL385G2, SAN, 100M pNIC attached: 419 Mbits/sec

- VM1 to VM2 (same vSwitch), DL385G2, SAN, 1000M pNIC attached: 453 Mbits/sec

- VM1 to VM2 (different vSwitch), DL385G2, SAN, 100M pNICs attached: 94.5 Mbits/sec

- VM1 to VM2 (different vSwitch), DL385G2, SAN, 1000M pNICs attached: 401 Mbits/sec

- VM1 to VM1 (IP of VM), DL385G2, SAN, internal vSwitch (no pNIC): 321 Mbits/sec

- VM1 to VM1 (IP of VM), DL385G2, SAN, vSwitch with 100M pNIC: 358 Mbits/sec

- VM1 to VM1 (IP of VM), DL385G2, SAN, vSwitch with 1000M pNIC: 324 Mbits/sec

- VM1 to VM1 (127.0.0.1), DL385G2, SAN, internal vSwitch (no pNIC): 1.05 Gbits/sec

- VM1 to VM1 (127.0.0.1), DL385G2, SAN, vSwitch with 100M pNIC: 1.03 Gbits/sec

- VM1 to VM1 (127.0.0.1), DL385G2, SAN, vSwitch with 1000M pNIC: 1.06 Gbits/sec

Non-virtualized system (Windows Server, dual Xeon 2.4 GHz, ML530G2, 2GB RAM, NC6136 NIC) to VM1, 1000M: 244 Mbits/sec

VM1 to non-virtualized system (same as above), 1000M: 367 Mbits/sec

non-virtualized (ML530 G2) to non-virtualized (DL380G4) 1000M: 842 Mbits/sec

  • In configuration (1), no data goes onto the networking wire. The performance here is highly dependent on the server hardware: its CPU and memory subsystem.

If this is true, then why is there a difference between the above test results?

To me it seems that the dependence on CPU and memory only comes into play when running the localhost test.

Oliver

JonT
Enthusiast

Thanks for the input and questions in this thread. Most of my tests were done VM-to-VM on the same host and vSwitch, but not all. Where possible I tested VM-to-VM on different hosts and vSwitches, but the results didn't seem to change much.

My best example to point out our real difficulties here would be the testing I conducted on the IBM x3950 platform:

IBM x3950 - 2 types of NICs

- Onboard Broadcom NICs

VM to VM

64k: 670 MBytes, 561 Mbits/s

VM to Localhost

64k: 729 MBytes, 611 Mbits/s

- NetXtreme II BCM5706

VM to VM

64k: 848 MBytes, 711 Mbits/s

VM to Localhost

64k: 803 MBytes, 674 Mbits/s

I ran a lot of other tests, but basically we shouldn't be seeing performance like this when the traffic only goes out to the pSwitch and back. The speed shown for the same vSwitch makes more sense when you think of it as a simple disk transaction.

I edited this portion because I had posted the wrong results for my "complaint". My 385 G1 tests weren't bad, but not stellar compared to the G2.

Message was edited by:

JonT
