VMware Cloud Community
dchang92606
Contributor

Disk write performance very slow when multiple VMs are powered on but not active

I ran into an issue with slow disk write performance.  My set-up is VERY basic (JBOD disks).  I have an HP xw8400 with 2 x Xeon 5160 CPUs and 20GB of RAM.  ESXi 4.1 is installed on a 4GB USB flash drive.  There are 2 datastores (1 x 320GB SATA 7200rpm disk, 1 x 160GB SATA 7200rpm disk) and a CentOS NAS server (using samba).  All VMs have vmware tools installed.  I'm using the LSI SAS controller because the pvscsi driver doesn't work on boot disks.  I'm using the vmxnet3 ethernet driver.  Each VM is 40GB in size (not thin provisioned).

I have 5 VMs defined (clones of each other) and turned on at the same time.  They are all Windows Server 2008 32-bit.  However, when testing, there's only one active VM.

The testing involves mapping a Z: drive from the CentOS NAS to the VM.  There are 3 different tests:

1. Copy the file (5GB) from the C: drive (disk) to the Z: drive (NAS) - this runs fast

2. Read the file on the Z: drive (NAS) - this runs fast

3. Copy the file from the Z: drive (NAS) back to the C: drive (disk) - this is slow

Here's what I found...

Disk Read performance is reasonable (for both datastores) - 60MB/sec - 82MB/sec

NAS Read performance is reasonable - 50MB/sec - 77MB/sec

NAS Write performance is reasonable - 38MB/sec - 54MB/sec

Disk Write performance is very slow - 7MB/sec - 16MB/sec

As part of my troubleshooting steps, I turned off all except two of the VMs; the disk write performance picked up.  It went from about 13MB/sec (average) to over 50MB/sec (average).

Attached is a performance graph. Each horizontal square represents about 2.5 minutes. Each vertical square represents 4.4MB/sec disk write speed. Each green bar in the graph represents one file transfer (about 5GB). Note the two tall bars on the left. Each took about 2 minutes to complete. This happened when only 2 VMs were running. Then, I brought up another VM and ran the file transfer. The middle bar shows a run time of about 8 minutes. Then, I shut down the other 2 VMs (only 1 VM turned on) and ran the file transfer again. The file transfer went fast again and finished in 2 minutes.

Also, when the disk write speed is slow, the CPU usage is very low (0% - 21%).  When the disk write speed is reasonable, the CPU usage is high (77% - 91%).  These percentages are shown by 'perfmon' on the VM.

Thus, my question is: WHY does ESXi slow down so much when multiple VMs are turned on but not active?  I have 4 cores and 5 VMs.  All the other VMs are not active during the testing.

DC

DSTAVERT
Immortal

Disk write caching is critical in a virtual environment. Unless you have disk write caching enabled, you will have very poor performance. You usually can't enable write caching without a battery-backed caching module. More importantly, you don't want to enable it without the battery, since a loss of power will corrupt the disk.

-- David -- VMware Communities Moderator
dchang92606
Contributor

David, I've read a variety of posts about disk write performance issues. Most of them point the finger at write caching (however misguided that may be). I don't have a disk write performance problem when a single VM is turned on. That tells me that ESXi 4.1 is capable of driving the underlying I/O subsystem (however mediocre it is) at a reasonable performance level (reads: ~70MB/sec, writes: ~50MB/sec) regardless of the write caching settings.

My problem is that when I turn on several VMs, the performance changes (along with the CPU usage). That tells me that ESXi communicates differently with the underlying I/O subsystem when multiple VMs are turned on (not multiple active VMs, just VMs that are turned on). It's almost as if ESXi is time-slicing access to the datastores for each VM even if it is not active. And it's doing it in a particularly bad way (i.e., single-tasking even though I have 4 cores). I can see this because the CPU goes from 0% to 20% and back down to 0% very frequently (about every 2-3 seconds).

Is there some sort of I/O locking mechanism that ESXi does to give each VM a chance to communicate with a datastore?  If so, is that tunable?

beyondvm
Hot Shot

Not active does not necessarily mean no I/O; try setting the graphs to IOPS. Also keep in mind that a 7200rpm SATA drive can only deliver about ~100 IOPS; they are terribly slow. I would not expect to get reasonable performance beyond about 2 VMs per SATA disk, and even that is a stretch. Again, try setting the performance graph to read in IOPS and see what it reaches when you are doing your test and when the VMs are idle.
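To make that concrete, here is a rough back-of-envelope sketch (an illustration only, not a measurement; the ~100 IOPS ceiling and the I/O sizes are assumptions) of how little throughput a single SATA spindle delivers once competing I/O forces it into seek-bound, random-access territory:

# Rough illustration (assumed numbers): a 7200rpm SATA spindle can stream
# large sequential writes quickly, but once competing I/O forces head seeks
# it is limited to roughly ~100 I/Os per second. In that regime, throughput
# is simply IOPS multiplied by the average I/O size.

RANDOM_IOPS = 100                    # assumed random-I/O ceiling for one SATA disk
for io_size_kb in (4, 32, 64, 256):  # assumed average I/O sizes
    mb_per_s = RANDOM_IOPS * io_size_kb / 1024
    print(f"{io_size_kb:>3} KB I/Os at {RANDOM_IOPS} IOPS -> ~{mb_per_s:.1f} MB/s")

At typical guest write sizes that works out to single-digit MB/sec, which is roughly the range reported for the slow case above.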

--- If you found any of my comments helpful please consider awarding points for "Correct" or "Helpful". Thanks!!! www.beyondvm.com
DSTAVERT
Immortal

A standard installation provides equal shares to each virtual disk for each virtual machine. You can modify that behavior and enable different share values for each disk. Modifying the standard behavior will usually result in poorer overall performance unless the need is fully understood.

When you have one VM running, it has 100% access to storage, not much different from a standalone server. As for misguided, you do have write issues, and the reason is the lack of write caching. Without caching, when a write operation occurs, ESX(i) must wait for it to complete before handing access to storage to the next VM, idle or not.
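As a rough illustration of what equal shares mean in practice (a simplified model only, not ESXi's actual disk scheduler; it assumes every listed VM has I/O outstanding, and the share values 1000/2000 for the Normal/High settings are assumptions):

# Simplified proportional-share model (an illustration, not ESXi's real
# disk scheduler): when several VMs have I/O outstanding on the same
# datastore, each receives bandwidth in proportion to its disk shares.

def bandwidth_split(total_mb_s, shares):
    """Divide the datastore's throughput among VMs with pending I/O."""
    total = sum(shares.values())
    return {vm: round(total_mb_s * s / total, 1) for vm, s in shares.items()}

# One busy VM plus two other powered-on VMs, all at an assumed default of 1000 shares:
print(bandwidth_split(50.0, {"busy-vm": 1000, "idle-vm1": 1000, "idle-vm2": 1000}))

# Raising the busy VM's shares (e.g. to an assumed "High" value of 2000) shifts the split:
print(bandwidth_split(50.0, {"busy-vm": 2000, "idle-vm1": 1000, "idle-vm2": 1000}))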

-- David -- VMware Communities Moderator
dchang92606
Contributor

Thanks for the feedback. I appreciate the quick responses. I was away from the office a few days and didn't have a chance to reply. I've also ordered a couple of 15K RPM SAS drives to do some comparisons against the SATA drives. I'm also going to duplicate my environment on a server with a RAID adapter that has a battery-backed cache.

As for the suggestion to check the IOPS because SATA drives can only perform about 100 IOPS, I looked into that. The problem is I can't find the right performance counter to look at. There's 'number of read requests' and 'number of write requests'. But I don't think that's it, because I get numbers in the 400-500 range and it shouldn't exceed 100. Also, my understanding is that IOPS = <bytes per second> / <chunk size>. I don't know the 'chunk size'. Is it 1MB because my VMFS block size is 1MB? What's the best way to get IOPS values?
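For what it's worth, the divisor in that formula is the average size of the individual I/O requests the guest issues (often somewhere in the 4KB-64KB range, which perfmon can report), not the 1MB VMFS block size, which only governs how VMFS allocates file space. A quick sketch with assumed request sizes shows why counts in the 400-500 range are plausible:

# IOPS is just bytes-per-second divided by the average I/O request size.
# The divisor is the size of the individual requests the guest issues
# (often 4 KB - 64 KB), not the 1 MB VMFS block size. The request sizes
# below are assumptions for illustration.

def iops(throughput_mb_s, avg_io_kb):
    return throughput_mb_s * 1024 / avg_io_kb

print(f"{iops(13, 32):.0f} IOPS")    # 13 MB/s at 32 KB requests -> ~416, in the observed 400-500 range
print(f"{iops(13, 1024):.0f} IOPS")  # the same 13 MB/s at 1 MB requests -> ~13

Also, the ~100 IOPS rule of thumb applies to fully random I/O; largely sequential requests can complete far faster, so seeing 400-500 requests per second is not necessarily a contradiction.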

As for the equal distribution of I/O to all servers and the write cache issue, I agree that write caching would be a huge benefit in real-world situations where the I/O requests are choppy. Given a sufficiently large cache, I/O operations appear tremendously fast. However, when the cache is saturated, I/O operations will slow down (in fact, having a cache in the path of large I/O operations can hinder performance). In synthetic I/O testing, all of the operations involve data volumes that are significantly larger than ordinary RAID adapter caches. Even large SANs have limited caches. One of the operations I am testing is a database backup. Thus, for a small database (< 20GB used space), in order to back it up, the computer will have to read 20GB of data and write 20GB of data.
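A quick worked estimate of what that backup would take at the transfer rates reported earlier (a back-of-envelope sketch that assumes the read and write phases do not overlap):

# Rough duration of a 20 GB backup (20 GB read + 20 GB written) at the
# observed transfer rates; assumes the read and write do not overlap.

def backup_minutes(read_mb_s, write_mb_s, size_gb=20):
    size_mb = size_gb * 1024
    return (size_mb / read_mb_s + size_mb / write_mb_s) / 60

print(f"{backup_minutes(70, 50):.0f} minutes")  # ~12 min with writes around 50 MB/s
print(f"{backup_minutes(70, 13):.0f} minutes")  # ~31 min when writes drop to ~13 MB/s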

Part of my frustration is that I'm running ESXi and cannot analyze the I/O subsystem very well. That is, if I were running ESX, I could use all of the tools of the underlying operating system to tell me what's going on. With ESXi, I have no visibility into that. Also, the behavior of the CPU is kind of strange. If the other guest VMs are turned on and doing small amounts of I/O while one VM is doing a large I/O operation, wouldn't the transfer rates and CPU utilization be more volatile? Instead, the transfer rates are pretty consistent, and so is the CPU utilization. It's almost like there is a governor of some sort holding back one VM from consuming all the resources (unless there is only one VM turned on).
