We are running the Free Hypervisor on hosts using DAS only (Crazy, but I didn't get a say in this). Specifically we are running Dell R510 servers with (12) 300GB SAS drives. We have been running various tests to try to determine what would be the best disk setup / configuration to use in an attempt to maximize the performance of our DAS. RAID 10 is a requirement, so based on that here is some interesting performance data that we acquired through various tests using Bonnie++.
1. Disk Layout - our options were: (1) one 12 disk RAID 10, (2) two 6 disk RAID 10 arrays, (3) three 4 disk RAID 10 arrays. The 12 disk setup loses this battle on nearly every metric. The 6 disk and 4 disk arrays perform similarly, although for our purposes we chose to go with the 6 disk setup to give us more platters per array. If I remember right, reads were better on the 6 disk and writes were slightly better on the 4 disk. Anyway, the biggest difference was seen when using 12 disks... that sucks compared to the other two setups.
WINNER: (2) two 6 disk RAID 10 arrays.
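For reference, the capacity and spindle math behind those three layouts works out like this (a quick Python sketch using the drive size and counts from the post; the function name is my own):

```python
# Compare the three RAID 10 layout options for 12 x 300GB SAS drives.
# RAID 10 mirrors half the disks, then stripes data across the mirror pairs.

DRIVE_GB = 300
TOTAL_DRIVES = 12

def raid10_layout(disks_per_array):
    """Return (number of arrays, usable GB per array, mirror pairs striped)."""
    arrays = TOTAL_DRIVES // disks_per_array
    usable_gb = (disks_per_array // 2) * DRIVE_GB  # half the disks hold mirrors
    mirror_pairs = disks_per_array // 2            # stripes span the mirror pairs
    return arrays, usable_gb, mirror_pairs

for n in (12, 6, 4):
    arrays, usable_gb, pairs = raid10_layout(n)
    print(f"{n}-disk RAID 10: {arrays} array(s), "
          f"{usable_gb} GB usable each, striped over {pairs} mirror pairs")
```

Total usable capacity is the same 1800GB either way; the layouts differ only in how many independent arrays you get and how wide each stripe is.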
2. Aligned versus misaligned - We will be taking many VMs out of an old ESXi 3.5 / SAN storage environment and moving them to DAS; most if not all of these VMs are misaligned. I'm not even going to attempt to align their C: drives, but would it be worth my time to align the D: and E: drives of these machines? There is definitely a benefit to aligned VMs, but not enough to merit fixing all of the old ones (700). We have very few VMs that are DB servers, but I suspect we will try to align those. The results we got back from various tests with Bonnie++ were all over the place. Sometimes the results even showed a disadvantage, which makes no sense and can only be explained as falling within some unmeasured and unknown margin of error. At no point did we see the alleged 40% performance increase... typical numbers were closer to 5%-10%.
WINNER: Aligned, but only worth the effort on DB servers or as we create new VMs.
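The alignment check itself is simple arithmetic: a partition is aligned when its starting byte offset is an even multiple of the stripe unit. The classic Windows 2003 / ESX 3.x default of starting at sector 63 (32256 bytes) is exactly what makes the old VMs misaligned. A minimal sketch (function name is mine; 512-byte sectors assumed):

```python
# A partition is aligned when its byte offset falls on a stripe-unit boundary.

SECTOR_BYTES = 512

def is_aligned(start_sector, stripe_unit_kib):
    """True if the partition's starting offset is a multiple of the stripe unit."""
    return (start_sector * SECTOR_BYTES) % (stripe_unit_kib * 1024) == 0

print(is_aligned(63, 64))    # legacy default offset: misaligned -> False
print(is_aligned(128, 64))   # 64KiB offset: aligned -> True
print(is_aligned(2048, 256)) # 1MiB offset (modern default): aligned -> True
```

The nice property of a 1MiB starting offset is that it stays aligned for any power-of-two stripe unit up to 1024k, including the 256k chosen above.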
3. RAID array stripe unit size - This is how much data is written to one disk before having to jump to the next disk within a single stripe. Evidently, performance in general follows some sort of bell curve. We used the default 64k of our H700 RAID controller as the control and found that setting this to 256k gave us the best overall performance gains. One would think 1024k would be best, but that was not the case. OK, so here are some raw percentages we pulled out of our tests using Bonnie++; here we are looking at the advantage of going with 256k stripe units:
With large files (256MB) we see:
56% improvement in Seq Block Writes
26% improvement in Seq Rewrites
6% improvement in Seq Block Reads
22% improvement in Random Seeks per Sec
And for many small files (1KB) we see:
22% improvement in Seq Creates
75% improvement in Seq Reads!!!!!
23% improvement in Seq Deletes
19% improvement in Random Creates
68% improvement in Random Reads!!!!!!
32% improvement in Random Deletes
WINNER: 256k Stripe Unit size
4. VMFS block size - Regardless of which block size you choose (1MB, 2MB, 4MB, or 8MB), the hypervisor is going to break small files down into 64KB sub-blocks. One would assume that 1MB should outperform 8MB... but this was not the case. In general, on all of the sequential tests 8MB outperforms 1MB by about 20% for large files (256MB) and performs just a tiny bit better on small files (even on random access). We have one or two VMs with rather large disks, so a while back, to eliminate issues, we just blindly set everything to 8MB. Now we were second-guessing that decision, but it seems we are just fine sticking with 8MB. Will we ever have a VM with 2TB disks... NO, but if we have to even touch this setting, we might as well max it out.
WINNER: 8MB simply because that is the road we have already started down, and it comes at no performance cost.
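For anyone wondering where the 2TB figure comes from: as I understand it, VMFS3 caps each file at roughly 256K file blocks, so the maximum VMDK size scales directly with the block size (worth double-checking against VMware's own docs; this sketch ignores the 512-byte rounding on the 8MB case):

```python
# VMFS3 ties maximum file (VMDK) size to the datastore block size:
# roughly 256K file blocks per file, so max size = block size * 262,144.

MAX_BLOCKS_PER_FILE = 256 * 1024  # assumed VMFS3 per-file block limit

def max_vmdk_gb(block_mb):
    """Approximate maximum VMDK size in GB for a given VMFS3 block size in MB."""
    return block_mb * MAX_BLOCKS_PER_FILE // 1024  # MB -> GB

for block_mb in (1, 2, 4, 8):
    print(f"{block_mb}MB block size -> ~{max_vmdk_gb(block_mb)} GB max VMDK")
```

So 1MB blocks cap a VMDK at ~256GB while 8MB blocks allow ~2TB, which is why "max it out" is the safe choice if the performance cost is nil.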
These are my findings; has anyone else tested this stuff? Did you see a huge benefit from using 256k stripe units? I was testing with aligned Linux systems running Bonnie++; has anyone done similar testing with Windows VMs running sqlio.exe or anything like that?
Thanks for taking the time to post this info. I am sure it will be helpful to a lot of people.
I did a bunch of performance testing some months ago, but I was really only testing RAID5 vs RAID10 and didn't explore performance with different sizes of RAID10 arrays. There certainly was improvement with RAID10 compared with RAID5. In my environment we have decided to go with RAID5 for most LUNs though, so as to maximise storage capacity, but we will use RAID10 for running SQL Server databases.
Regarding your comments on partition alignment ... I've only ever done this with a physical server ... but if you have Server 2003 or earlier I think it is well worth the time investment to get the disks aligned, as you will probably see a decent performance boost. In my case we had a SQL server with a 50GB database (incl. one table of roughly 40M rows) where the DBs were on a separate RAID5 D: drive. After partition alignment was done we saw a 14% improvement in raw disk performance (measured with HDTune). We also took the opportunity to format the D: drive with a 64KB cluster size. When we tested some reports (SQL Reporting Services) we found up to a 40% improvement in performance, which was huge.
These are interesting findings. Can you please tell us which controller you use? H200 / H700?
So far all of our systems have used the H700 controller.