jpreou
Contributor

Slow network file/copy performance

Accidentally posted this in ESX3.0, so re-posting here...

We recently installed an HP C3000 chassis with three BL460c blades, each running ESX 3.5 (Foundation, patched to just prior to Update 2) and backed by an HP MSA2012fc SAN. The blade chassis has Gb2E Ethernet switches which connect to a Cisco 3750 core switch. What we have noted is that CPU utilization on the file/print server (which was P2V'd using VMware Converter) is pegged during network activity, and network file transfer performance is slower than expected. Historical logs clearly show the CPU peaking from 8am to 6pm, when users access the server: 90%+ CPU during business hours and virtually nothing outside of hours. It did not exhibit this behaviour when it was a physical server. Even after a full cold reset of the entire environment (done while we re-racked the server room after the initial install) we still saw high CPU, though not quite at the 90% level (around 45%-60% now). Some simple 'real world' diagnostics using file copy tests showed the following:

  1. Copying a 370MB file between freshly built VMs is just fine, regardless of which host they are on.

  2. Copying the same file between P2V'd VMs on separate hosts is also just fine.

  3. Copying the same file between P2V'd VMs on the same host (i.e. not even touching the external network) is slow and pegs the CPU.

  4. I rolled out two new servers from template (fresh builds) on each host and copied the same file between them using an "Internal Only" network with no uplinks. All good.

By slow, I mean that normally the 370MB file takes around 10-15 seconds to copy between VMs, but in scenario 3 it takes around 3½ minutes. It only seems to be between P2V'd machines, and only when the machines are on the same ESX host. In each case the CPU pegs to 100% during the file copy. We have checked this on other ESX servers at other customers and don't see the same behaviour there.
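To put those timings on a common scale, here is a minimal Python sketch that converts them to MB/s. The helper names `throughput_mb_s` and `timed_copy` are made up for illustration and are not part of any VMware tooling; `timed_copy` only assumes a source file and a destination path (for example a mapped share on the other VM) and uses the standard library to time a copy.

```python
import shutil
import time
from pathlib import Path
from typing import Tuple


def throughput_mb_s(size_mb: float, seconds: float) -> float:
    """Average transfer rate in MB/s for size_mb copied in the given seconds."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return size_mb / seconds


def timed_copy(src: str, dst: str) -> Tuple[float, float]:
    """Copy src to dst and return (elapsed seconds, MB/s).

    Hypothetical helper: dst could be a path on a share exported by the other VM.
    """
    size_mb = Path(src).stat().st_size / (1024 * 1024)
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    elapsed = time.perf_counter() - start
    return elapsed, throughput_mb_s(size_mb, elapsed)


# Figures reported in this post: a 370MB file, 10-15 s normally,
# about 3.5 minutes (210 s) between P2V'd VMs on the same host.
normal_best = throughput_mb_s(370, 10)
normal_worst = throughput_mb_s(370, 15)
scenario_3 = throughput_mb_s(370, 3.5 * 60)

print(f"normal:     {normal_worst:.1f} to {normal_best:.1f} MB/s")
print(f"scenario 3: {scenario_3:.2f} MB/s")
print(f"slowdown:   {normal_worst / scenario_3:.0f}x to {normal_best / scenario_3:.0f}x")
```

On those numbers the slow case works out to under 2 MB/s against roughly 25 to 37 MB/s normally, i.e. somewhere between 14 and 21 times slower.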

When doing the P2V we went through all the 'normal' post-conversion cleanup tasks: took the HAL back to single proc (for 1 vCPU), removed hidden and non-present devices, uninstalled all software, utilities and drivers that were not required (like all the HP stuff), installed VMware Tools, etc. All the network cards, switches and interfaces are Gbit and set to auto-negotiate, and I have confirmed that everything is running at full speed and full duplex. Network cards in the VMs are using the Flex driver installed with Tools. In theory, though, network copies between VMs on the same host shouldn't even touch the physical network.

Anyone got any ideas for me before I call VMware?

3 Replies
JRink
Enthusiast

Are these Win2000 VMs? Win2003 VMs? I had an issue with my P2V'd Win2000 machine until I got the HALs sorted out. I made a thread about it; if you want, you can search for it under my username. The VM eventually cleared up after I adjusted the HAL, the number of processors, etc.

jpreou
Contributor

Sorry, the VMs are all Win2003 Standard. Some SP1, some SP2. Upgrading the file/print server from SP1 to SP2 didn't help. All VMs had the number of vCPUs reduced from two (or more) to a single vCPU after the P2V conversion but before first power-up. All VMs booted just fine, and the first task completed was to change the HAL to the single CPU/Uniprocessor HAL as appropriate. We then restarted and removed all unnecessary software (like all the HP drivers and software, etc). Restarted again and removed all hidden and non-present hardware from Device Manager. Restarted again. Installed VMware Tools, and then finally restarted again. Configured all the IP addressing, etc, shut down the production physical box, swapped the network over from "Internal Only" to "Production LAN" and restarted again if necessary to resolve networking issues. The whole P2V process went very smoothly apart from this performance issue, which we see when copying files from one VM to another on the same host, and only between the P2V'd VMs, not the clean-built ones.

seniornwb
Contributor

Did you ever get this fixed?

I am running a P2V'd W2K box and a P2V'd W2K3 box on the same physical server on ESX 3.5 U4. When I copy files from one to the other it is terribly slow. When I look at the performance graph in ESX, the guest's CPU utilization is reported above 50% for the whole copy. When I look at Performance Monitor within Windows (either 2000 or 2003), it shows only 2-3% CPU utilization?

Kind regards,

Hen
