Okay, I've been banging my head against this problem for a couple of days, and I'm almost convinced that there is some kind of artificial restriction in place in ESXi that is preventing network *transmission* speeds (uploads *from* guests) from being able to reach peak performance.
I have a Dell PowerEdge 2950 running ESXi 4.1u1 (I've also tried 4.0; same story). It's plugged into a gigabit switch. On the same switching infrastructure, I have a MikroTik RouterBoard 1000 (PowerPC-based) running RouterOS 4.x (based on Linux kernel 2.6.x). RouterOS has a built-in bandwidth test (network saturation) tool, and MikroTik has also released a free Windows bandwidth test tool that speaks their proprietary "MikroTik bandwidth test protocol".
If I initiate a TCP download test from the RouterBoard to a Windows Server 2003 guest on my Dell 2950 running ESXi 4.1, I see 200-300Mbit/s. An upload test from the guest to the RouterBoard shows a pretty consistent 100Mbit/s. (The Win2003 guest was the only guest running on the 2950 at the time.)
So the question I had was, what is causing both the download and upload speeds to be so poor on an all-gigabit switched network?
Next I tried UDP from RouterBoard to ESXi Win2003 guest. 900Mbit/s!! Fantastic! However, a UDP test in the opposite direction -- from ESXi guest to RouterBoard -- is a very consistent 100Mbit/s.
So traffic from the outside TO an ESXi guest performs admirably. The TCP download managing only 33% of the UDP download is explained by the slow upload speed (the TCP ACKs the ESXi guest has to transmit back to the RouterBoard during the TCP download test).
But traffic FROM the ESXi guest to the outside is only able to achieve 100Mbit/s, 1/10th of the potential of the network it is connected to.
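For anyone who wants to see the shape of the test I'm describing: the MikroTik btest protocol is proprietary, so this isn't its wire format, but a one-directional UDP flood test boils down to a few lines of Python (loopback address, port, and packet size here are arbitrary placeholders):

```python
import socket
import threading
import time

def udp_throughput_test(host="127.0.0.1", port=5001,
                        payload=1470, duration=1.0):
    """One-directional UDP flood; returns (sent_mbps, received_mbps)."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind((host, port))
    rx.settimeout(0.2)
    received = [0]

    def receiver():
        # Count received bytes until the sender has been quiet for 0.2s.
        while True:
            try:
                data, _ = rx.recvfrom(65535)
                received[0] += len(data)
            except socket.timeout:
                break

    t = threading.Thread(target=receiver)
    t.start()

    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    packet = b"\x00" * payload  # a stream of zeros, like the btest default
    sent = 0
    deadline = time.time() + duration
    while time.time() < deadline:
        tx.sendto(packet, (host, port))
        sent += payload
    t.join()
    tx.close()
    rx.close()

    to_mbps = lambda n: n * 8 / duration / 1e6
    return to_mbps(sent), to_mbps(received[0])

if __name__ == "__main__":
    sent, recv = udp_throughput_test()
    print(f"sent {sent:.0f} Mbit/s, received {recv:.0f} Mbit/s")
```

The sent/received gap over a real network (or a vSwitch) is exactly what the asymmetry above shows up in.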
I have performed these other tests to eliminate potential variables:
1) Instead of using the gigabit Broadcom ethernet interfaces built on to the Dell 2950 mainboard, I added an Intel E1000 PCIe card to the ESXi host, and used that instead. Performance was exactly the same, so the problem was not the Broadcom chipset.
2) Instead of emulating the E1000 adapter in the guest, I switched to using VMXNET3. Same exact performance.
3) MikroTik RouterOS, which is Linux at its heart, has an x86 version, so I ran the x86 RouterOS as a guest on the 2950 under 4.1u1, and conducted bandwidth tests between it and the RouterBoard 1000. Performance was EXACTLY the same as it was in the Windows guest (900Mbit/s UDP download, 100Mbit/s UDP upload), so the problem was not with Windows or with how Windows specifically performs as a guest under ESXi 4.1.
4) I made sure that the hardware switching infrastructure was good by running bandwidth tests between an x86 box with a gigabit ethernet port that WASN'T running ESXi and the RouterBoard 1000, and performance was fantastic in BOTH directions. So I know the problem is not between the switch and the Dell 2950, or between switches.
5) I also proved the problem wasn't with the switches by running bandwidth tests between the RouterOS (Linux) guest and the Windows 2003 Server guest RUNNING ON THE SAME HOST, and neither one could achieve more than 100Mbit/s of throughput to the other guest running on the same machine! The traffic wasn't even touching my switch OR the physical ethernet adapter!
So, in conclusion, it doesn't matter what operating system I am running as a guest under ESXi, and the network traffic doesn't even have to leave the host for the problem to be evident! This can only lead me to conclude that ESXi is, for some (stupid?) reason, artificially capping the transmit throughput of all guests running on it.
The questions are obvious:
1) Why is ESXi capping guest transmit throughput?
2) What can I do to turn this restriction off?
What am I missing? Surely I can't be the only one suffering from this, or the first person to notice or run across this problem. But if someone else HAS run across this before, I can't find anybody talking about it...I have searched high and low.
I'm tearing my hair out. Please help.
Have you also tried other virtual NIC types?
Do you have a dedicated physical NIC for VM networking?
Have you tried using an Intel card instead of the integrated Broadcom NIC?
Thanks for the reply, but if you had actually read my entire message, you would have known that I already tried all of those things: 1) tried VMXNET3 instead of E1000; 2) yes, I've tried a dedicated card (didn't explicitly state this); 3) I installed an E1000 PCIe card and used it instead of the Broadcom.
In fact, if you had actually read the post, you would have learned that the problem isn't the physical network or the physical network card because the problem happens between VM guests (guest-to-guest on the same host).
First off, leave your attitude at the door; it's not appreciated, nor is it gaining you any respect or brownie points.
Now, onto the issue at hand:
Might be worth trying out some of these settings:
I think there are options to disable the checksum and maybe tcp/ip offload, but I can't remember off the top of my head.
Remember, however, that virtualization is designed to balance resource utilization without allowing a single guest to monopolize all the resources, so at the crux of it, that's probably why you can't peg the VM.
Out of curiosity, have you tried running against 2 VMs from an external host at the same time, just to see if you get the same performance out of both (i.e., basically giving you an outbound total of 200Mbit/s)?
You're right, and I apologize. I guess I was just frustrated that I had spent all of that time trying to diagnose the problem and then went to the effort of documenting all of my efforts in detail, only to have it all ignored in the response. I think the cynical side of me saw it as an attempt by someone to get some community points on the cheap...
Anyway, I'll gather the results of that command and try the suggestions you linked to as soon as I can, which will most likely be when I am next in the office (tomorrow). Also, running two simultaneous tests from two guests to two outside hosts is an interesting thought and something I had not yet considered...I will definitely give that a go and report the results back. Thank you.
I, too, have my doubts that if you are looking for ethernet link autonegotiation results in the output of the esxcfg-nics command that you will find anything; as you said, it seems highly unlikely that both NICs will have mis-negotiated. Even if it did reveal something, I doubt it would fully explain the problem since I see the problem guest-to-guest in the same host. I'm willing to bet that if I created a new vSwitch that had no physical NICs assigned to it and then attached both guests to that vSwitch and ran tests between them, I would still see the problem. (In fact, I'll give that a go as well.)
Also, I don't think those Windows TCP/IP registry tweaks you linked to will prove to be effective in this case. Most of them seem to deal with TCP performance tuning, but as I mentioned in my initial post, I am seeing the guest upload slowdown on unidirectional UDP traffic as well. Also, remember that I tried a non-Windows guest and it did the exact same thing, so the problem is not Windows (or guest-OS) specific.
P.S. -- If it turns out that your theory is correct that the hypervisor is preventing any one guest from saturating the network resource, is there any way to adjust the threshold at which this restriction cuts in? I can understand the need for it to do so, but it cutting in at a mere 10% is ridiculous.
Does the bandwidth test copy files? If so, there is also the possibility that the poor performance is caused by your disk configuration rather than the network.
Otherwise the only other thing I can suggest is to try VMDirectPath to add a NIC directly to a guest and test network performance from there. That way you will be using a physical NIC, instead of a virtual one. If performance is still bad, it means the issue is with your configuration and not the hypervisor.
Thanks for the response. No, the bandwidth test utility I'm using does not involve disk I/O in any way; it either generates a stream of 0s or random data (so either highly compressible or essentially incompressible), depending on how it's configured.
I wanted to try VMDirectPath, but the Dell PowerEdge 2950 does not support VT-d, so that's unfortunately not an option. I do have a PowerEdge R410 that I have access to which does support VMDirectPath, so I may give that a try after confirming that the issue also exists on that host.
As far as the other tests are concerned, I couldn't stand waiting until tomorrow, so I VPN'd in and tried a couple of things.
1) First, I tried doing simultaneous bandwidth tests from two guests (Windows and Linux) to two external non-virtual hosts. I was able to sustain ~100Mbit/s UDP transmit on both hosts simultaneously for a total of ~200Mbit/s! So that would seem to indicate it is not the ethernet card or the ESXi physical card driver (which we already knew because of the guest-to-guest tests, but this just reaffirms that).
2) I then tried two tests simultaneously from the same guest. Each test's throughput was halved (to ~50Mbit/s each). So the cap applies to the guest as a whole.
3) I created another vSwitch with NO physical adapters bound to it, and moved my two guests over to that vSwitch. Neither guest had any contact now with the outside physical network; they could only talk to each other. However, they were still throughput-limited to each other.
Honestly, it feels to me like Traffic Shaping is enabled on the vSwitch even though it isn't. Just for kicks, I tried enabling traffic shaping on the vSwitch and putting ridiculously high values in, but that made no difference. If I lowered the numbers to 50Mbit/s, though, traffic shaping did kick in at that level.
1) I was NOT able to test a VMDirectPath-attached network card on a guest running on the R410, sadly. The two onboard network interfaces on the R410 seem to be "bound" together somehow (sharing a PCI bridge chip maybe?) and ESXi warns me that I will lose management to the box if I try to make vmnic1 available for VMDirectPath because vmnic0 is dependent on it. Bleh. So I was going to throw an E1000 in a PCIe slot and try that, but there is only one space available on the R410, and it is already occupied by a PERC.
2) Interestingly, if I copy over the same Windows guest to the R410 that I have on the 2950 and try UDP upload tests with that through a vSwitch, I get 400Mbit/s! Fascinating.
3) I shut down ESXi on the 2950 (which is still in a non-production state), and booted a copy of MikroTik RouterOS (Linux) on it from a USB key, and ran the same UDP upload tests. I got 950Mbit/s. So, on the same hardware, as a guest of ESXi I get ~100Mbit/s, but running the same OS bare-metal outside of the hypervisor, I can nearly saturate the gigabit link. ESXi is definitely negatively affecting the performance.
Another curveball: this all started when I needed to convert this same 2950 from VMware Server running on top of Windows Server 2003 x64 to ESXi. I moved the guests off of VMware Server to an equivalent 2950 that was underutilized and put ESXi on the first 2950. I moved them over with VMware Converter running directly on the host OS (pointed it directly at the VMXes while the guests were shut down). The VMDKs moved from VMware Server to ESXi averaged 30-40MB/s. However, when I try to move them BACK, I can only achieve an average of 10MB/s, and it's going to take hours at that rate. I don't think it is a disk throughput issue on the target since it's getting written to a 15KRPM SAS drive and there are NO running guests on the target host when I attempt the migration...
Once I saw the UDP upload test results, I assumed that the slow migration speed was related to this and that the problem was with the upload from the box that currently has the guests on it. However, on a whim, I tried migrating the same guests over to the R410, and was seeing 30-40MB/s.
Just for grins, I also tried migrating the same guest from the 2950 to an old 2650 we've got laying around that's running ESXi 3.5u5. I averaged over 20MB/s to that host, double what I see to the other 2950.
So...I don't know anymore. There is obviously a transmission throughput problem on the 2950s as evidenced by the fact that an OS outside of ESXi can fully saturate the ethernet interface, but inside of ESXi on the exact same hardware I only see 1/10th of that. However, at the same time, I seemingly can do 300-400Mbit/s (which is excellent) from the 2950 to the R410 during a VMware Converter assisted migration, but only a quarter of that speed between 2950s, which would seem to indicate that there is also a RECEIVE problem on the target 2950, too. So I may be dealing with multiple issues here.
My head hurts.
Wow, a lot of troubleshooting so far. Here are a couple more things you might try.
During the 100Mbit/s upload, check out esxtop, hit N, and see if there is anything unusual in the statistics.
Specifically check for %DRPTX and %DRPRX to see if the host is dropping any packets.
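If you'd rather capture than watch, esxtop also has a batch mode (`esxtop -b`) that dumps everything as one wide CSV, which you can filter afterwards. A sketch of pulling out just the drop columns; note the header strings in the sample below are invented for illustration, since the real esxtop column names vary by build:

```python
import csv
import io

# Illustrative two-line sample of `esxtop -b -n 1` output; treat the
# header strings as placeholders, not real esxtop column names.
SAMPLE = (
    '"Time","\\\\esx1\\Network Port(vSwitch0:vm1)\\%DropTxPkt",'
    '"\\\\esx1\\Network Port(vSwitch0:vm1)\\%DropRxPkt"\n'
    '"04/15/2011 10:00:00","0.25","0.00"\n'
)

def drop_counters(csv_text):
    """Return {column_name: value} for every %Drop* column in the capture."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    row = next(reader)
    return {name: float(val)
            for name, val in zip(header, row)
            if "%Drop" in name}

if __name__ == "__main__":
    for name, value in drop_counters(SAMPLE).items():
        print(f"{name}: {value}")
```

Anything consistently non-zero on the transmit side would point at the host dropping packets on the way out.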
You might try enabling remote tech support mode and, using WinSCP or a similar tool, try uploading and downloading a file to/from ESXi directly.
Drop a large file on the MikroTik and SCP it to the /tmp directory on ESXi, and vice versa, and see what kind of speed you get.
Just be careful not to fill up the / partition, but you should be fine with a few-hundred-meg file (just delete it right afterwards).
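To generate that test file without copying anything real, a quick one-off like this would do (the path and size are arbitrary; random bytes keep any compression out of the picture):

```python
import os

def make_test_file(path, size_mb=200, chunk_mb=4):
    """Write size_mb of pseudo-random bytes to path in chunks,
    so we never hold the whole file in memory."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    written = 0
    target = size_mb * 1024 * 1024
    with open(path, "wb") as f:
        while written < target:
            f.write(chunk)
            written += len(chunk)
    return written

if __name__ == "__main__":
    # SCP this to /tmp on the ESXi host, then delete both copies when done.
    print(make_test_file("/tmp/btest.bin"))
```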
Another thing might be to load Wireshark on the guest OS and do a network capture during the upload.
Then you can inspect the actual traffic, and maybe that will give you some clues.
Or try a traffic capture from the ESXi tech support mode console itself, using tcpdump-uw.
One last thing, simple and stupid question, but I assume VMware tools is installed and up to date on the guest OS?
Thanks for the reply. I wish I still had helpful points to award on this thread...esxtop was new to me. I do already have local and remote "tech support" (SSH server) mode enabled; it's one of the first things that gets enabled on a new ESXi install for me.
I think I can finally put this to rest. As I surmised in one of my last responses, there were, in fact, multiple problems. Some were just getting hidden behind others. To be specific and to reiterate, it turns out that I had three issues that I was tackling separately:
1) Upload throughput from Windows guests.
2) Upload throughput from RouterOS guests.
3) VMware Converter transfer speed.
They all seemed to be related because the throughput numbers I was seeing for all three problems were roughly equal to each other, sitting at ~100Mbit/s. It turns out that they had no relation to each other at all.
I'll do these out-of-order:
#2 - I noticed that the version of RouterOS that I was running off of the USB key on the PowerEdge 2950 was a newer version than the one I was running virtualized. I upgraded the RouterOS guest to match, and...whoa: now I can push 950Mbit/s of outbound UDP through a vSwitch from RouterOS. Okay, so the Windows throughput issue and the RouterOS throughput issue were *unrelated*.
#3 - I am now convinced that the slow VMware Converter results are disk I/O related (although I'm not sure what I'm going to do about it yet, if anything).
You may recall that I initially moved a guest from one 2950 (running VMware Server on W2K3) to another 2950 (running ESXi 4) at a rate of 30-40MB/s, only to find that when I tried to move it back to the first 2950 after putting ESXi on it, I could only get 8-10MB/s (so roughly 100Mbit/s, +/- a few Mbits). But later I tried moving the same guest over to a PowerEdge R410 with 7.2KRPM SATA drives in a RAID-6 (attached to a PERC 6/i), and saw 40-50MB/s!
I later took down that same R410 and removed the PERC and RAID drives so that I could put a network card in the PCIe slot in order to have something I could use to play with VMDirectPath, and attached a single 7.2KRPM SATA drive to the on-board SATA controller to use as a temporary/test VMFS store. When I moved the same guest over to that drive, I saw an average of 10-15MB/s. So, the performance of the SATA drives attached to the PERC, combined with its battery-backed cache, unsurprisingly blew away the performance of the single SATA drive.
The surprising bit is that the 2950 that I want to move this guest back to does not have a RAID controller, but does have a bunch of individual 15KRPM SAS drives attached to a Dell SAS 5/i controller. You would think that I should be able to achieve write speeds greater than 10MB/s to them, but it does not appear so. From the reading I've been doing on this subject, it sounds like ESXi really does not like controllers that don't have their own write cache memory, and that even if we didn't want these drives in a RAID array, we should have sprung for a full-blown PERC controller and just done single-drive RAID 0 volumes for every physical disk. OSes like Windows I believe can use system RAM for buffering "lazy writes" to drives, but I'm guessing that ESXi has no such mechanism and so any write cache MUST be supplied by the drive controller. If you don't have a drive controller with a dedicated write cache, write performance in ESXi is going to be abysmal. Live and learn, I guess.
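The size of the gap between lazy (buffered) writes and writes that must be committed before returning is easy to demonstrate on any ordinary OS. This little experiment is only an illustration of the principle, not ESXi's actual I/O path; paths and sizes are arbitrary:

```python
import os
import time

def write_throughput(path, block=64 * 1024, count=64, sync_each=False):
    """Write `count` blocks of `block` bytes; with sync_each, fsync after
    every block, roughly how a hypervisor must commit guest I/O when the
    guest believes it is on stable storage. Returns MB/s."""
    data = b"\x00" * block
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(count):
            f.write(data)
            if sync_each:
                f.flush()
                os.fsync(f.fileno())
    elapsed = max(time.perf_counter() - start, 1e-9)
    os.remove(path)
    return block * count / elapsed / 1e6

if __name__ == "__main__":
    print(f"buffered: {write_throughput('/tmp/_wt.bin'):.1f} MB/s")
    print(f"fsync'ed: {write_throughput('/tmp/_wt.bin', sync_each=True):.1f} MB/s")
```

On a spinning disk without a battery-backed controller cache, the fsync'ed number collapses, which is exactly the behavior I was seeing on the SAS 5/i.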
#1 - My Windows guests still can only achieve 100Mbit/s UDP uploads on the 2950s, and 400Mbit/s on the R410. Perhaps it is CPU-utilization-related somehow. Still don't have an answer to this one, but I'm satisfied that it has nothing to do with the performance of traffic from guests flowing through a vSwitch now that I have been able to get a RouterOS guest under ESXi to fully saturate the gigabit link.
Thanks, everybody, for your helpful suggestions and your sympathetic ears.
All the best,
Overcommitment of vCPUs can cause poor network performance, as can using stock network drivers (which are updated with the VMware Tools). On Debian-based Linux distros, for example, updating the vmxnet drivers after the tools are installed can give a massive boost in network throughput.
Re overcommitting CPU, less is usually more. So on the 2950, run a single VM with a single vCPU assigned when performance testing (ESXi needs CPU resources to deal with the I/O itself too, so allocating 4 vCPUs on a quad-core box does not work too well). Any 2950 should easily saturate GbE running like this, though.
Re disk, yes the write cache is absolutely needed. ESX(i) has to take a conservative approach because it is supposed to be invisible to the running guests, hence IO has to be committed when the guest thinks it is. The transfer rate seen without write cache seems to be a function of spindle speed and block size (one block written per revolution).
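That "one block written per revolution" model is easy to sanity-check with arithmetic (the 64KB block size below is an assumption for illustration, not a measured value):

```python
def sync_write_ceiling(rpm, block_bytes):
    """Upper bound on write throughput, in MB/s, if exactly one block
    is committed to the platter per revolution."""
    revs_per_sec = rpm / 60
    return revs_per_sec * block_bytes / 1e6

if __name__ == "__main__":
    # 15K RPM SAS drive, 64 KB blocks: 250 rev/s * 64 KB = 16.384 MB/s,
    # the same ballpark as the ~10 MB/s seen on the cache-less SAS 5/i.
    print(f"{sync_write_ceiling(15000, 64 * 1024):.1f} MB/s")
```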
Hope that helps.
I've had similar problems with my guest systems. I had ESXi 3.5 before, but have upgraded to full vSphere 4, all new installs.
But I do see poor network performance between Windows hosts running 2003 or 2008; 2008 R2 performs badly.
Have you tried iperf between your machines? I get 4-5Mbit/s when a Windows machine is both client and server, but if the server is a Linux machine I get really good performance. It's also only the send/upload speeds that are poor; receive speed is always good, at more than 500-600Mbit/s.
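For anyone who wants to reproduce this comparison without installing iperf, the shape of the test is simple; here is a rough loopback TCP version in Python (this is not iperf itself, and the port and chunk size are arbitrary placeholders):

```python
import socket
import threading
import time

def tcp_throughput_test(host="127.0.0.1", port=5002,
                        chunk=64 * 1024, duration=1.0):
    """iperf-style one-way TCP test over a local socket pair; returns Mbit/s."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, port))
    srv.listen(1)
    received = [0]

    def server():
        # Accept one connection and count bytes until the client closes.
        conn, _ = srv.accept()
        while True:
            data = conn.recv(chunk)
            if not data:
                break
            received[0] += len(data)
        conn.close()

    t = threading.Thread(target=server)
    t.start()

    cli = socket.create_connection((host, port))
    payload = b"\x00" * chunk
    deadline = time.time() + duration
    while time.time() < deadline:
        cli.sendall(payload)
    cli.close()
    t.join()
    srv.close()
    return received[0] * 8 / duration / 1e6

if __name__ == "__main__":
    print(f"{tcp_throughput_test():.0f} Mbit/s over loopback")
```

Run the equivalent between two guests (Windows-to-Windows vs. Linux-as-server) and you should see the same asymmetry I'm describing.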
I've been troubleshooting this with VMware for 2 months and they couldn't solve it, and I've now been working with Microsoft for 2 months and they can't solve it either; I've gone through 5-6 guys at Microsoft and it's now with the developers, and not even they can find the problem.
The "funny" thing is that if VMware creates a machine for me and sends it to me, I get really good performance; if I send my machine to them, they get bad performance.
Do you have any similar problems? We have tried everything you have: same host, same storage, only a virtual switch so no real network transfer, and still only 5Mbit/s.
Any help at this stage is gratefully received.
- I have reached more than 300Mbit/s with these tricks:
- Try to run RouterOS alone on the ESXi host; otherwise, check that Latency (<1%) and CPU Ready (<50) are low. You can check this on the Performance tab in ESXi or with (r)esxtop.
- If CPU usage in the guest is too high, give it more vCPUs, but don't overdo it or it will increase Latency and Ready time.
- Check that there are no drops in the Interfaces section of RouterOS, and also no drops in (r)esxtop.
- Try to avoid RouterOS queues if possible.
- If you are interested, I can tell you how to modify the RouterOS e1000 driver to increase buffers.
- Check that RouterOS RPS is enabled on high-load interfaces.
To check performance:
- Run a lot of pings from RouterOS to another machine on each interface that carries heavy traffic. Check that ping times don't go high and there is no packet loss.
Finally, note that RouterOS is kind of unsupported as a guest, since you cannot install VMware Tools or the VMXNET drivers, AFAIK.