15.
Re: ESXi 3.5 U4 on HP DL380 G6 has slow disk performance Jul 2, 2009 5:26 AM
No further changes were needed to anything. I could not find any way to enable write caching on the RAID controller. I assume it's an automatic setting which is enabled when the RAID controller detects the battery pack attached to the RAID cache DIMM. The only indication that the 1GB RAID cache with the battery pack was installed (other than a speed increase in writes) was the RAID controller message displayed on the server's console during power on. I seem to recall that a message displayed just after installing the 1GB RAID cache with the battery pack indicated that the battery pack was not fully charged and that RAID performance would not be optimal at that time.
Re: ESXi 3.5 U4 on HP DL380 G6 has slow disk performance Jul 2, 2009 5:28 AM
17.
Re: ESXi 3.5 U4 on HP DL380 G6 has slow disk performance Jul 2, 2009 5:46 AM
It should have been a sequential write test. I tested write performance by observing the VI Client performance chart for disk I/O while creating a 1 GB file on a RHEL4 VM. I believe the command was "dd if=/dev/zero of=/tmp/test.txt bs=1k count=1000000". As for partition alignment, I have no idea whether that's a possibility. However, datastore creation and VM creation/installation were performed via the VI Client.
18.
Re: ESXi 3.5 U4 on HP DL380 G6 has slow disk performance Jul 3, 2009 6:26 AM
Hi, same slow disk performance problem. I've downgraded the firmware version to 1.58C, but this is the rate I get:

time dd if=/dev/zero of=/tmp/test.txt bs=1k count=1000000
1000000+0 records in
1000000+0 records out
real 1m1.819s
user 0m0.300s
sys 0m2.380s

The I/O is less than 20MB/s. Can you try the same test and paste the output here? Thanks in advance!
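For anyone comparing numbers, here is a minimal sketch of how the rate follows from the dd output above. It assumes dd's bs=1k means 1024 bytes per block and uses decimal megabytes (1 MB = 10**6 bytes); the function name is made up for illustration.

```python
def dd_throughput_mb_s(block_bytes, count, elapsed_s):
    """Average throughput in MB/s (1 MB = 10**6 bytes) for a dd-style run."""
    total_bytes = block_bytes * count
    return total_bytes / elapsed_s / 1e6


# Figures from the post: bs=1k count=1000000, real 1m1.819s
elapsed = 60 + 1.819
rate = dd_throughput_mb_s(1024, 1_000_000, elapsed)
print(f"{rate:.1f} MB/s")  # prints "16.6 MB/s"
```

That is consistent with the "less than 20MB/s" figure quoted. Note that "real" is wall-clock time, so it is the right value to divide by; "user" and "sys" only count CPU time.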
19.
Re: ESXi 3.5 U4 on HP DL380 G6 has slow disk performance Jul 3, 2009 7:34 AM
Might be worth retesting with a bigger block and file size. In my test environment (Perc 5i, RAID 5, 3x 500GB SATA), a 1K block size with a 1GB file gives 15MB/s. A 64K block with a 1GB file gives >250MB/s (an effect of the 512MB write cache), and a 64K block with a 2GB file gives ~60MB/s, which is more realistic. Note this is dd for Windows though.
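If you want to repeat the block-size comparison without dd, something along these lines should work on either Linux or Windows. It is only a rough sketch: the function name and the demo sizes are my own choices, and the fsync only flushes the OS page cache, not a controller's write cache, so the file still needs to be well beyond the controller cache size to get a realistic figure.

```python
import os
import tempfile
import time


def sequential_write_mb_s(path, block_size, total_bytes):
    """Write total_bytes of zeros to path in block_size chunks; return MB/s.

    os.fsync runs before the clock stops so data still held in the OS page
    cache is included in the timing. A RAID controller's own cache is NOT
    bypassed by this.
    """
    block = bytes(block_size)
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    return written / elapsed / 1e6


if __name__ == "__main__":
    # Demo size only (8 MiB) so the script finishes quickly; for a real test
    # use a total several times larger than the controller cache.
    demo_total = 8 * 1024 * 1024
    with tempfile.TemporaryDirectory() as tmp:
        for bs in (1024, 64 * 1024):
            target = os.path.join(tmp, "test.bin")
            rate = sequential_write_mb_s(target, bs, demo_total)
            print(f"block {bs:>6} bytes: {rate:.1f} MB/s")
```

Comparing the 1K and 64K lines at the same total size separates the per-call overhead of small blocks from what the disks can actually sustain.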
20.
Re: ESXi 3.5 U4 on HP DL380 G6 has slow disk performance Jul 6, 2009 11:05 AM
Thanks for posting that adding the BBWC helped improve your performance. I have 3 ESXi 3.5 boxes, all HP DL360 Gen5s, all with similar slowness. I've asked my manager to purchase BBWCs to improve the situation. It's good to see another confirmed instance where this helped resolve the issue.
21.
Re: ESXi 3.5 U4 on HP DL380 G6 has slow disk performance Jul 16, 2009 11:46 AM
We got our BBWC in this week and got it installed. After about 24 hours, we tested and are now seeing write speeds of approx 60MB/s on our RAID5 array, which we can live with. Sure beats the pants off 4MB/s.
22.
Re: ESXi 3.5 U4 on HP DL380 G6 has slow disk performance Jul 16, 2009 12:35 PM
Just a note, we have an open call with HP on the P410 SmartArray. Under certain specific conditions on a RAID 5 array with 512MB cache we have managed to reliably hang the RAID controller in both Windows 2003 and RHEL 5.
On Windows using IOMeter we have seen brilliant performance with just read or just write I/O, but with small I/O and combined read/write the performance VERY quickly degrades, and after about 30 minutes the disks stop responding. I would suggest you log a call with HP.
23.
Re: ESXi 3.5 U4 on HP DL380 G6 has slow disk performance Jul 19, 2009 2:30 PM
I have recently rolled out an HP ML330 G6 server with a P410 zero-cache controller and 3x 250GB SATA near-line disks. I am running a virtualised version of the old client server in the main VM (the only VM at present); this was achieved using the VMware converter, and all went reasonably well. Two of the disks are mirrored, with the 3rd as a hot spare.

The first half morning of semi-live tests indicated a performance issue, which I relatively quickly had down as a write cache issue. Windows Performance Monitor showed a disk queue length >50 for much of the time, and IOMeter showed disk throughput lower than I would expect. I have just discovered that HP ship all their disks with the onboard DISK write cache switched off (I am guessing that this is just in case you are using a zero MB controller with no battery-backed cache and maybe you do not run a server with a UPS either). I guess I must have missed the great big NEON sign that said - "for customer safety we have ensured that your write performance will be abysmal - just in case you are silly enough to run without safety nets". I have not been able to go back to check that switching the on-disk caching back on helps (a 4am site visit tomorrow will give us the results - argh). I hope the above diatribe and debrief helps anyone else.

Of course - I am wondering if people are running their servers with decent levels of battery-backed cache but still have the default on-disk cache off - I guess they might never realise that the on-disk cache is OFF and could be enjoying even better performance - what a waste that would be. Makes you wonder why anyone works in IT, when the default settings make an expensive SAS RAID controller running mirrored disks slower than a very slow thing indeed. I will report back late tomorrow or perhaps even Tuesday when my brain returns from its hiding place. Any feedback welcomed.
24.
Re: ESXi 3.5 U4 on HP DL380 G6 has slow disk performance Jul 19, 2009 2:58 PM
Unfortunately that server's disk spec is very poor for ESXi. I read recently that 'near-line' is derived from the drives being somewhere between on-line and off-line (tape) storage - generally little more than desktop SATA disks. In any case the controller-based BBWC is critical. I'm not sure the underlying disk write cache will make any measurable difference when a controller BBWC is installed (particularly with SAS disks). As a point of interest, I performed a clean shut-down on my test ESXi box (which has 512MB BBWC on a Dell Perc 5i), turned it off and disconnected the battery for a few moments. The result was that the volume was corrupted and I had to restore all my VMs from a backup - I would conclude that enabling write caching anywhere without battery backup is pretty brave.
25.
Re: ESXi 3.5 U4 on HP DL380 G6 has slow disk performance Jul 19, 2009 2:59 PM
Paul Mead wrote:
Of course - I am wondering if people are running their servers with decent levels of battery-backed cache but still have the default on-disk cache off - I guess they might never realise that the on-disk cache is OFF and could be enjoying even better performance - what a waste that would be.

Oh boy, this has been discussed and disputed for eons. THOU SHALT NOT USE WRITE CACHE ON THE DISK ITSELF. Anything could happen if your disk controller and your OS think they wrote the data and it hasn't happened. There is more than the main power that could fail. Go buy a BBWC for your controller if you depend on your VMs and your data.
26.
Re: ESXi 3.5 U4 on HP DL380 G6 has slow disk performance Jul 19, 2009 6:56 PM
Well - I am back; mixed feelings - especially with some of the feedback - thanks to all for your input so far.

Additional info: The server was first prepped at our site and then taken to the client site for final config and physical-to-virtual conversion. When the server first came online after the move, but before conversion, one of the 2 mirrored disks had complained and appeared to be performing a rebuild. I let this run overnight before the 1/2 day semi-live run.

Tonight I have been over to the client out of hours and run a few tests. I ran IOMeter with 2 workers and 2 access specifications: the first with 32k blocks, 100% read and 0% random, and the 2nd with 32k blocks, 100% write and 0% random, for 1 minute each (2 minutes total run). In simple terms, a seq read test and then a seq write test. I then proceeded to deliberately switch OFF the on-disk caching (which I had in fact switched on during the initial build - I just thought I had not), ran the tests, and then reran the tests with on-disk caching switched back on. FULL physical server and virtual server power down and restart for completeness before each test.

Summary Results:
OnDisk caching OFF: avg 47MBps read and avg 5.6MBps writes
OnDisk caching ON: avg 66MBps read and avg 52MBps writes

So that is looking quite good; unfortunately I do not have exactly the same results for comparison from the previous 24 hours of poor performance, although the closest comparison would seem to be:
Bad/Weird State with partial caching?: avg 37MBps read and avg 29MBps writes

Interim conclusion: I suspect that the disk/array problem after the server's physical move may have left the on-disk caching in an odd state. What I would say is that although things seem to be better, at times the server avg disk queue length is still higher than I would normally be happy with - I suspect that we may need to invest in some BBWC kit.

Just for completeness: the disks used in this server are 3x 250GB "Midline" disks, pn: 458926-B21. If anyone has the same setup but with 256 or 512MB cache, with or without battery backup, I would be very interested to see your results; I can send the IOMeter settings file across if desired. My own virtualised server, running in a similar fashion using an LSI Megaraid SATA 300-8XLP SATA II RAID adapter with 128MB embedded cache and 2x 1.5TB mirrored disks (pn: WD15EADS), gets figures of 92MBps read and 110MBps write! Seems odd, but overall it appears nippy enough in practice, with no long disk queue during any of the normal usage or IOMeter workloads tested so far (obviously not trying hard enough).
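To put the three sets of IOMeter figures above side by side, here is a quick sketch that works out the write/read ratio and the write speedup relative to the caching-OFF run. The numbers are copied from the summary results; the labels and function name are just for illustration.

```python
# (read MBps, write MBps) as reported in the summary results
results = {
    "on-disk cache OFF": (47.0, 5.6),
    "on-disk cache ON": (66.0, 52.0),
    "earlier odd state": (37.0, 29.0),
}


def write_speedups(data, baseline_label):
    """Return each run's write throughput relative to the baseline run."""
    baseline_write = data[baseline_label][1]
    return {label: write / baseline_write for label, (_, write) in data.items()}


speedups = write_speedups(results, "on-disk cache OFF")
for label, (read, write) in results.items():
    print(
        f"{label:18s} read {read:5.1f}  write {write:5.1f}  "
        f"write/read {write / read:.2f}  write vs OFF {speedups[label]:.1f}x"
    )
```

With caching OFF, writes run at roughly an eighth of the read rate, while with caching ON they are within about 20% of reads, which is why the problem shows up mainly as a write bottleneck.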
Re: ESXi 3.5 U4 on HP DL380 G6 has slow disk performance Jul 20, 2009 12:48 AM
28.
Re: ESXi 3.5 U4 on HP DL380 G6 has slow disk performance Jul 20, 2009 8:21 AM
grantdavies: We recently added the BBWC to a DL360 Gen 5 that was performing slowly in our data centre. I took the server down to physically install the battery, then booted the server with SmartStart to ensure that the cache ratios were set up to the defaults (75% write and 25% read). They were there automatically. I then booted the server back up into VMware ESXi 3.5. Approx 24 hours later, I confirmed that speeds were approx 60-70MB/s, which was a huge improvement over previous speeds of 2-4MB/s. Thus, I did nothing else other than shut down, plug in the battery and turn back on. It's worth noting too that I had a P400i with 256MB of cache; I left the 256MB of cache in place, just added the BBWC, and our problems are resolved.
29.
Re: ESXi 3.5 U4 on HP DL380 G6 has slow disk performance Jul 29, 2009 10:15 AM
Thanks to all, but this was the issue: because there is no RAID cache on the ML330 G6's P410 RAID controller as standard, there was a significant write performance hit (visible in the Performance utility and the disk queue). It was eventually found that the ESET NOD32 antivirus product was using XMON to check the Exchange datastore; every time the AV signature was updated, the entire Exchange datastore would be rechecked! Normally this would not matter, as most of the time our servers and our clients' servers are performing optimally and the effect of the scan is masked. Due to the cache issue, the scan effect dragged down the entire server's performance. Disk queues could be as high as 40-50 for tens of minutes at a time. The XMON background scanning can be turned off (in fact some recommend that it is turned off, despite it being on by default).

We have resolved the overall issue by adding 512MB battery-backed cache to the controller. Performance is much, much better, and I believe that if we had started with the cache, we would never have noticed any issues at all. Please note that ESET NOD32 XMON is not at fault - it simply highlighted the disk performance issue.

My main gripe is that for some reason HP sell the current ML330 G6 generation of servers with NO (zero) cache memory as standard on the P410 controller! Silly me thought that they could not possibly sell something so handicapped. Not only that, when you decide to upgrade you can either purchase a 256MB module or 512MB with battery. Why not 256MB with battery?

FYI we now get disk queues of <2 most of the time, with occasional blips of higher queue levels - but it is like comparing the Norfolk Broads with Wales! IOMeter tests show throughputs of 160MBps reads and 230MBps writes (not bad for a 7200rpm mirrored pair), which compares to early non-BBWC results of approx 66MBps read and 52MBps writes. These non-BBWC results are worse than the headline values indicate, due to the type of IOMeter test and also the fact that the disk queue length was so high during the test.

Still a learning exercise for me: (1) ESXi NEEDS a properly cached controller for decent write performance - more so than when running on a physical box, it would seem - and (2) not to take HP specs at face value. I would love to get some feedback from HP on this matter, as I was intending to standardise on their kit as we seek to use ESXi for all our clients, i.e. why sell a RAID controller with zero cache memory and not insist on a cache upgrade, or at least make it very clear that you are buying something which for many purposes will be too slow? I sign off somewhat bemused.