Keith001
Contributor

ESXi 3.5 U4 on HP DL380 G6 has slow disk performance

I've installed ESXi 3.5 U4, both with and without the latest VMware patches, on an HP DL380 G6 server with a P410i RAID controller and 8 SAS drives in RAID 50. The server firmware (RAID controller and BIOS) has been updated to the most current versions.

Unfortunately, a RHEL4 VM achieves a maximum of only 5-6 MB/s for disk writes. When I run two RHEL4 VMs simultaneously, each gets 2.5-3 MB/s of write throughput.

I'm fairly certain the hardware is good, because a native installation of RHEL5 U3 on the same server can do a local file copy at 250+ MB/s.
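For anyone who wants to sanity-check numbers like these inside a guest, a quick timed sequential write along the lines of the sketch below (file size and block size are arbitrary, and it is no substitute for a proper benchmark) gives figures in the same ballpark as a dd or file-copy test:

```python
#!/usr/bin/env python
# Rough sequential-write throughput check (a sketch, not a proper benchmark).
# Writes a large file in fixed-size blocks, fsyncs, and reports MB/s.
import os, time

TEST_FILE = "/tmp/seqwrite.tmp"   # hypothetical path; point it at the volume under test
BLOCK_MB = 1                      # write in 1 MB blocks
TOTAL_MB = 512                    # total amount to write

block = b"\0" * (BLOCK_MB * 1024 * 1024)
start = time.time()
with open(TEST_FILE, "wb") as f:
    for _ in range(TOTAL_MB // BLOCK_MB):
        f.write(block)
    f.flush()
    os.fsync(f.fileno())          # ensure the OS has pushed the data to the device
elapsed = time.time() - start
os.remove(TEST_FILE)

print("wrote %d MB in %.1f s = %.1f MB/s" % (TOTAL_MB, elapsed, TOTAL_MB / elapsed))
```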

Does anyone have any knowledge of performance issues regarding the HP Smart Array p410i RAID controller and the VMware ESXi 3.5 U4 cciss driver?

patparks1
Contributor

Thanks for posting that adding the BBWC improved your performance. I have three ESXi 3.5 boxes, all HP DL360 G5s, all with similar slowness. I've asked my manager to approve purchasing BBWC modules to improve the situation. It's good to see another confirmed instance where this resolved the issue.

patparks1
Contributor

Our BBWC arrived this week and we got it installed. After about 24 hours we tested again and are now seeing write speeds of approximately 60 MB/s on our RAID 5 array, which we can live with. It certainly beats 4 MB/s.

Rabie
Contributor

Just a note: we have an open call with HP on the P410 Smart Array. Under certain specific conditions on a RAID 5 array with 512 MB cache, we have managed to reliably hang the RAID controller under both Windows 2003 and RHEL 5.

On Windows, using IOMeter, we have seen excellent performance with read-only or write-only I/O, but with small mixed read/write I/O the performance degrades very quickly, and after about 30 minutes the disks stop responding.

I would suggest you log a call with HP.

Paul_Mead
Contributor

I have recently rolled out an HP ML330 G6 server with a zero-cache P410 controller and 3x 250 GB SATA near-line disks. I am running a virtualised copy of the old client server as the main (and currently only) VM, created with VMware Converter; all went reasonably well. The disk arrangement is two disks mirrored, with the third as a hot spare.

The first half-morning of semi-live tests indicated a performance issue, which I fairly quickly put down to write caching: Windows Performance Monitor showed the disk queue length above 50 for much of the time, and IOMeter showed disk throughput lower than I would expect.
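For the RHEL guests elsewhere in this thread, a rough equivalent of the perfmon disk-queue counter can be sampled from /proc/diskstats; a small sketch is below (the device name is an assumption). The eleventh statistics field of each line is the weighted time spent doing I/O in milliseconds, and its delta divided by the wall-clock interval approximates the average queue length, which is essentially how iostat derives avgqu-sz.

```python
#!/usr/bin/env python
# Approximate the average disk queue length on Linux by sampling /proc/diskstats twice.
# A sketch only; DEVICE is an assumption - change it to the disk under test.
import time

DEVICE = "sda"      # hypothetical device name
INTERVAL = 5.0      # seconds between samples

def weighted_io_ms(device):
    # The eleventh statistics field (index 13 after splitting the line, past
    # major/minor/name) is the weighted milliseconds spent doing I/O.
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            if fields[2] == device:
                return int(fields[13])
    raise ValueError("device %s not found in /proc/diskstats" % device)

before = weighted_io_ms(DEVICE)
time.sleep(INTERVAL)
after = weighted_io_ms(DEVICE)

# Average queue length over the interval, the same idea as iostat's avgqu-sz.
print("average queue length on %s: %.2f" % (DEVICE, (after - before) / (INTERVAL * 1000.0)))
```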

I have just discovered that HP ship all their disks with the on-disk write cache switched off (I am guessing this is in case you are using a zero-cache controller with no battery-backed cache, and perhaps no UPS either). I guess I must have missed the great big NEON sign that said "for customer safety we have ensured that your write performance will be abysmal, just in case you are silly enough to run without safety nets".

I have not yet been able to go back and check whether switching the on-disk caching back on helps (a 4 am site visit tomorrow will give us the results, argh). I hope the above diatribe and debrief helps someone else.

Of course, I am also wondering whether people running servers with decent levels of battery-backed cache still have the default on-disk cache off; they might never realise the on-disk cache is OFF and could be enjoying even better performance. What a waste that would be.

It makes you wonder why anyone works in IT, when the default settings make an expensive SAS RAID controller running mirrored disks slower than a very slow thing indeed.

I will report back late tomorrow, or perhaps even Tuesday, when my brain returns from its hiding place. Any feedback welcomed.

J1mbo
Virtuoso

Unfortunately that server's disk spec is very poor for ESXi. I read recently that 'near-line' derives from the drives sitting somewhere between on-line and off-line (tape) storage; generally they are little more than desktop SATA disks. In any case the controller-based BBWC is critical. I'm not sure the underlying disk write cache will make any measurable difference once a controller BBWC is installed (particularly with SAS disks).

As a point of interest, I performed a clean shutdown of my test ESXi box (which has 512 MB BBWC on a Dell PERC 5/i), turned it off, and disconnected the battery for a few moments. The result was that the volume was corrupted and I had to restore all my VMs from backup. I would conclude that enabling write caching anywhere without battery backup is pretty brave.

Jackobli
Virtuoso

Of course, I am also wondering whether people running servers with decent levels of battery-backed cache still have the default on-disk cache off; they might never realise the on-disk cache is OFF and could be enjoying even better performance. What a waste that would be.

Oh boy, this has been discussed and disputed for eons.

THOU SHALT NOT USE WRITE CACHE ON THE DISK ITSELF.

Anything could happen if your disk controller and your OS think they have written the data when it hasn't actually happened. There is more than just mains power that can fail.
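To make that concrete, here is a small sketch of where a "completed" write can actually be sitting. Userspace can only push data as far as the operating system and, from there, the device; if the drive's own volatile write cache is enabled and there is no battery behind it, even an fsync'd write may still only exist in drive RAM unless cache-flush commands are issued and honoured all the way down.

```python
#!/usr/bin/env python
# Illustration of the write-caching layers being discussed (a sketch, not a test).
# Each step only guarantees the data has moved one layer further down the stack.
import os

with open("important.dat", "wb") as f:
    f.write(b"payload")       # 1. data sits in the application/libc buffer
    f.flush()                 # 2. data handed to the OS page cache
    os.fsync(f.fileno())      # 3. OS asked to push the data to the storage device
    # 4. If the drive's volatile write cache is enabled (and there is no BBWC or
    #    battery), the data may still only be in drive RAM at this point. A power
    #    cut here can lose data that the OS and controller believe is on disk,
    #    unless cache-flush commands are issued and honoured down the chain.
```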

Go buy a BBWC for your controller if you depend on your VMs and your data.

Paul_Mead
Contributor

Well, I am back, with mixed feelings, especially given some of the feedback. Thanks to all for your input so far.

Additional info: the server was first prepped at our site and then taken to the client site for final configuration and the physical-to-virtual conversion. When the server first came online after the move, but before conversion, one of the two mirrored disks had complained and appeared to be performing a rebuild. I let this run overnight before the half-day semi-live run.

Tonight I have been over to the client out of hours and run a few tests. I ran IOMeter with two workers and two access specifications: 32 KB blocks, 100% read, 0% random, and 32 KB blocks, 100% write, 0% random, for one minute each (a two-minute total run); in simple terms, a sequential read test followed by a sequential write test. I then deliberately switched OFF the on-disk caching (which I had in fact switched on during the initial build, although I thought I had not), ran the tests, and then reran them with on-disk caching switched back on. For completeness, I did a full physical and virtual server power-down and restart before each test.

Summary Results:

On-disk caching OFF: avg 47 MB/s read, avg 5.6 MB/s write

On-disk caching ON: avg 66 MB/s read, avg 52 MB/s write

So that is looking quite good. Unfortunately I do not have exactly comparable results from the previous 24 hours of poor performance, although the closest comparison would seem to be:

Bad/weird state with partial caching(?): avg 37 MB/s read, avg 29 MB/s write

Interim conclusion: I suspect the disk/array problem after the physical move may have left the on-disk caching in an odd state.

What I would say is that although things seem better, at times the server's average disk queue length is still higher than I would normally be happy with; I suspect we may need to invest in some BBWC kit.

Just for completeness: the disks used in this server are 3x 250 GB "Midline" disks, part number 458926-B21.

If anyone has the same setup but with 256 or 512 MB cache, with or without battery backup, I would be very interested to see your results; I can send the IOMeter settings file across if desired.

My own virtualised server, running in a similar fashion on an LSI MegaRAID SATA 300-8XLP SATA II RAID adapter with 128 MB embedded cache and 2x 1.5 TB mirrored disks (part number WD15EADS), gets figures of 92 MB/s read and 110 MB/s write! It seems odd, but it appears nippy enough in practice, with no long disk queues during any of the normal usage or IOMeter workloads tested so far (obviously not trying hard enough).
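If anyone wants to compare numbers without installing IOMeter, the sketch below roughly approximates the two access specifications above (sequential 32 KB blocks, a write pass then a read pass, each timed and reported in MB/s). It is only a rough stand-in: a single worker, no queue-depth control, and the test-file size is an assumption; the read pass will also be flattered by the OS page cache unless the file is larger than guest RAM.

```python
#!/usr/bin/env python
# Rough stand-in for the IOMeter run above: sequential 32 KB writes, then
# sequential 32 KB reads, each timed and reported in MB/s. A sketch only.
import os, time

TEST_FILE = "iometer_like.tmp"    # hypothetical file on the volume under test
BLOCK = 32 * 1024                 # 32 KB blocks, matching the access specifications
FILE_MB = 256                     # test file size (an assumption)
blocks = FILE_MB * 1024 * 1024 // BLOCK

# Write pass (also creates the test file).
buf = b"\0" * BLOCK
start = time.time()
with open(TEST_FILE, "wb") as f:
    for _ in range(blocks):
        f.write(buf)
    os.fsync(f.fileno())
write_mbps = FILE_MB / (time.time() - start)

# Read pass. The OS page cache will inflate this number unless the file is
# bigger than guest RAM or the cache is dropped between the two passes.
start = time.time()
with open(TEST_FILE, "rb") as f:
    while f.read(BLOCK):
        pass
read_mbps = FILE_MB / (time.time() - start)

os.remove(TEST_FILE)
print("sequential read: %.1f MB/s, sequential write: %.1f MB/s" % (read_mbps, write_mbps))
```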

J1mbo
Virtuoso

The performance stats gathered confirm that write caching (or the lack of it) is the problem. However, the current configuration is dangerous.

patparks1
Contributor

grantdavies:

We recently added the BBWC to a DL360 G5 that was performing slowly in our data centre. I took the server down to physically install the battery, then booted it with SmartStart to check that the cache ratio was set to the default (75% write and 25% read); it was, automatically. I then booted the server back into VMware ESXi 3.5.

Approximately 24 hours later, I confirmed write speeds of around 60-70 MB/s, a huge improvement over the previous 2-4 MB/s.

So I did nothing other than shut down, plug in the battery, and power back on. It's also worth noting that this server has a P400i with 256 MB of cache; I left the existing 256 MB module in place, just added the BBWC, and our problems are resolved.

Paul_Mead
Contributor

Thanks to all, but this was the issue: because there is no RAID cache on the ML330 G6 P410 controller as standard, there was a significant write performance hit (visible in the performance utility and in the disk queue). It was eventually found that the ESET NOD32 antivirus product was using XMON to check the Exchange datastore; every time the AV signature was updated, the entire Exchange datastore would be rechecked! Normally this would not matter, as most of the time our servers and our clients' servers perform well and the effect of the scan is masked.

Due to the cache issue, however, the scan dragged down the entire server's performance. Disk queues could be as high as 40-50 for tens of minutes at a time.

The XMON background scanning can be turned off (in fact some recommend turning it off, despite it being on by default). We have resolved the overall issue by adding 512 MB of battery-backed cache to the controller, and performance is much, much better; I believe that if we had started with the cache, we would never have noticed any issues at all.

Please note that ESET NOD32 XMON is not at fault here; it simply highlighted the disk performance issue.

My main gripe is that for some reason HP sells the current ML330 G6 generation of servers with ZERO cache memory as standard on the P410 controller! Silly me thought they would not possibly sell something so handicapped. Not only that, but when you decide to upgrade you can purchase either a 256 MB module or a 512 MB module with battery. Why not 256 MB with battery?

FYI, we now get disk queues of <2 most of the time with occasional blips of higher queue levels, but it is like comparing the Norfolk Broads with Wales! IOMeter tests show throughputs of 160 MB/s read and 230 MB/s write (not bad for a 7200 rpm mirrored pair), compared to the earlier non-BBWC results of approximately 66 MB/s read and 52 MB/s write. The non-BBWC results are worse than the headline values suggest, given the type of IOMeter test and the fact that the disk queue length was so high during the run.

Still a learning exercise for me: (1) ESXi NEEDS a properly cached controller for decent write performance, more so than a physical box it would seem, and (2) don't take HP specs at face value.

I would love some feedback from HP on this matter, as I was intending to standardise on their kit as we move our clients to ESXi. Why sell a RAID controller with zero cache memory and not insist on a cache upgrade, or at least make it very clear that you are buying something which for many purposes will be too slow? I sign off somewhat bemused.

grantdavies
Contributor

OK, this is perplexing me. A week ago we installed a 512 MB cache DIMM with battery, and a week later (surely enough time for the battery to charge) we're getting sustained write performance of about 15 MB/s, up from 2-4 MB/s, but nowhere near what it should be. The array is 3x 450 GB 15K SAS disks. I haven't looked closely at burst speed, but surely for sustained read/write operations (e.g. backups and restores) you would expect better than 15 MB/s! A saturated USB 2.0 bus can usually manage 30-35 MB/s consistently. Any thoughts?

Rabie
Contributor

There is a known issue with the P410 controller which has been resolved by firmware update 2.00.

Although it says 2009/07/29, it only became generally available in mid-August.

http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?lang=en&cc=us&prodTypeId=15...

Version: 2.00 (27 Jul 2009)

Fixes

  • Fix to address high latency issues seen during heavy I/O conditions
    involving non-contiguous offset writes and RAID 5/6 write operations.

  • Fix for a potential controller lockup condition when a drive is hot removed during a read operation.

  • Additional fixes to address rare occurrences of controller unresponsiveness during heavy I/O stress tests in a lab environment.


grantdavies
Contributor

Thanks very much for sharing the info. Since this server is located at a remote site, I'll wait until our next on-site visit and let you know the results once I've updated the firmware.
