RDM vs. VMFS...again...

mcsenerd · ‎08-15-2009

Yeah yeah yeah...I know this subject seems to have been beaten to death, but apparently, at least at my organization...there is still a healthy amount of ignorance/disbelief/incredulousness (is that a real word?). Here's my current back story:

Although I'm the only certified person who eats/breathes/lives this VMware stuff in my organization...I'm still only a lowly sysadmin, and I'm beholden to the spread-thin resources of a far too multifaceted architecture group that makes most design decisions. These folks plopped in my lap one day a design for a two-host cluster that was to host a very high-profile legal retention and forensics application for our company. The design called for the use of 2 Dell PE710's w/48GB of RAM and dual Intel E5530's w/Hyperthreading enabled. This application is very dependent on the ability to move large amounts of often small files from one VM to another (sometimes...sometimes on the other host as well). After a proof of concept was run using very underpowered hardware (desktops and Buffalo NAS storage), we proceeded to build and deploy this system.

Three of the VMs that were to be hosted in this design were pure file servers. They each had several large VMDKs (around 1.7 to 1.8 TB) with single volumes on single RAID 5 LUNs hosted on a CX3-80. None of these LUNs had VMDKs on them belonging to any other machine. Well...when application performance testing began, it was immediately noted that file copy times were extremely slow...much slower than even in the POC.

So...we had previously deployed another cluster for a specific application (Documentum) that utilized MSCS and used RDM for supposed performance reasons. The gist from some folks was that we should try doing the RDM thing here as well to see if it improved performance any. Of course, I instantly piped up and stated that it was VMware's position that RDM should not be utilized solely on the basis of performance, that any improvement according to their testing and documentation would only amount to around 6-8% at best, and that such an improvement wouldn't offset the increased management burden and other caveats involved with using RDM in a virtualized environment. (Yes...I've read all of the white papers and all of the big boy blog posts on the topic...)

In any case...I was overruled and we modified one of the file servers to utilize RDM instead of a VMDK hosted on a VMFS store. Well...lo and behold, the performance increased (or rather file copy time decreased) somewhere around 40-45%.

My thoughts are...well...this must mean that we've got something wrong in our current VMFS design if there's that kind of performance delta there. But what I'm really looking for right now is some real-world information. I don't need to be told to go look up whitepapers on testing by someone else...I'm looking for first-hand experience. Should I be seeing such a difference in file copy performance merely by moving to RDM from pure VMFS? If not...where should I look first in our current VMFS design to improve performance? (We already make sure partitions are properly aligned, we only utilize block sizes that accommodate the size of VMDK we expect to use, and we also usually utilize Diskeeper to pre-expand the MFT to its recommended size before placing VMs into production...). Also...if you've noted such a performance increase...could you elaborate and explain what led you to that point? Thanks in advance!
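For anyone who wants to double-check the partition alignment mentioned above, here is a minimal Python sketch of the arithmetic. It assumes 512-byte sectors and a 64 KB alignment boundary; the right boundary depends on the array's stripe element size, so treat `boundary_bytes` as a placeholder rather than a recommendation. The function name `is_aligned` is made up for illustration.

```python
# Check whether a partition's starting offset falls on an alignment boundary.
# ASSUMPTIONS: 512-byte sectors and a 64 KB boundary; adjust both to match
# the actual disk geometry and the array's stripe element size.

SECTOR_BYTES = 512


def is_aligned(start_sector: int, boundary_bytes: int = 64 * 1024) -> bool:
    """Return True if the partition start lands exactly on the boundary."""
    return (start_sector * SECTOR_BYTES) % boundary_bytes == 0


legacy_start = is_aligned(63)    # sector 63: the old MBR default start
aligned_start = is_aligned(128)  # sector 128: a 64 KB offset

print("start sector 63 aligned: ", legacy_start)
print("start sector 128 aligned:", aligned_start)
```

The starting sector itself has to come from the guest or host tooling (the partition table), and both the VMFS partition and the guest partition inside the VMDK need to pass the check.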

azn2kew · ‎08-15-2009

There is only a marginal performance difference between RDM and VMFS; the choice comes down to personal preference and how you want to manage it, with RDM for MSCS or VMFS for flexibility. If there are a number of disbelievers, the best way is to set both up and test them side by side, or read tested articles and blogs to prove the point. I use both RDM & VMFS in most cases because each has its own pros and cons, as you may already know. Here are some links that may be helpful reading

If you found this information useful, please consider awarding points for "Correct" or "Helpful". Thanks!!!

Regards,

Stefan Nguyen

VMware vExpert 2009

iGeek Systems Inc.

VMware, Citrix, Microsoft Consultant


mcsenerd · ‎08-16-2009

While I certainly appreciate the response Stefan, I'm really after real world...actual experience performance histories from peers. I've read every one of the links that you listed previously and I'm fully aware of the reasons to or not to use RDM in an environment, but the fact still stands that I'm not seeing the puny performance difference that's supposed to exist between the two methods...45 to 50% improvement is hard to ignore. So...that being said...I'm still after answers to my original questions if possible from folks.

Just as a matter of reference...this type of application often generates the million-file scenario, i.e., huge amounts (1TB plus) of small to very small files. An example import job in this scenario prior to using RDM was the following:

A 21GB robocopy job containing almost 85,000 files took just around 10 hours with a VMFS based VMDK...vs...when the same destination VM file server was converted to utilize RDM...the same job took only around 5 1/2 hours.

This isn't a marginal performance difference. I'm just looking to see if anyone else has seen similar outcomes in similar situations...
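To put the robocopy numbers above in perspective, here is a rough back-of-the-envelope calculation in Python. It uses only the approximate figures from the post (21 GB, about 85,000 files, 10 hours vs. 5.5 hours) and assumes 1 GB = 1024 MB; the function name `copy_rates` is made up for illustration.

```python
# Rough throughput arithmetic for the robocopy job described above.
# ASSUMPTION: 1 GB = 1024 MB; all inputs are the approximate figures
# quoted in the post, not measured values.


def copy_rates(total_gb: float, file_count: int, hours: float):
    """Return (MB per second, files per second) for a copy job."""
    seconds = hours * 3600
    mb_per_s = total_gb * 1024 / seconds
    files_per_s = file_count / seconds
    return mb_per_s, files_per_s


vmfs = copy_rates(21, 85_000, 10.0)  # VMFS-based VMDK destination
rdm = copy_rates(21, 85_000, 5.5)    # RDM destination
time_saved = 1 - 5.5 / 10.0          # fraction of copy time eliminated

print(f"VMFS: {vmfs[0]:.2f} MB/s, {vmfs[1]:.2f} files/s")
print(f"RDM:  {rdm[0]:.2f} MB/s, {rdm[1]:.2f} files/s")
print(f"Copy time reduced by {time_saved:.0%}")
```

That works out to roughly 0.6 MB/s and 2.4 files/s on VMFS versus about 1.1 MB/s and 4.3 files/s on RDM, a 45% reduction in copy time. Both absolute rates are very low, so the per-file rate is probably the more telling number for a small-file workload like this.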

MauroBonder · ‎08-17-2009

performance_char_vmfs_rdm.pdf

*If you found this information useful, please consider awarding points for "Correct" or "Helpful"*


mcsenerd · ‎08-17-2009

No offense...but...I did say...

.... But what I'm really looking for right now is some real-world information. I don't need information to go lookup whitepapers on testing by someone else...I'm looking for first-hand experience. ...

AndreTheGiant · ‎08-17-2009

I'm looking for first-hand experience.

I've tried several configurations (VMFS with 8K blocks, virtual RDM, physical RDM, ...).

Since ESX 3.5 I have not noticed a relevant (more than 5%) performance difference compared with VMDK solutions.

Maybe with the new DirectPath feature... (but I have not tested it yet).

And with RDM you can lose a lot of flexibility (for example backup/snapshots, ...).

The only reasons to use them would be a fast P2V (if the data is already on the SAN) or an application cluster inside VMs.

A different consideration applies to iSCSI: depending on the storage type and features, it could be useful (from a feature point of view) to have some disks connected with a software initiator inside the VM.

Andre

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro

Igwah · ‎10-20-2009

Hi,

I had two MSCS Server 2003 r2 Ent x64 - SQL 2005 clusters running using VMDKs on a FC SAN (4gb HBAs) and had issues where the cluster kept failing over. The guest O/S was throwing pop ups saying "Delayed Writes Failed" and application performance wasn't anything to scream about. Everything was configured to best practice standards.

We migrated to RDMs, formatted the same way as the VMDKs were, and lo and behold, the issues went away and performance increased.

I've spoken to people who said they won't virtualise SQL without RDMs due to similar experiences. Whilst SQL isn't a file server, it is high I/O so I suppose I have had similar experiences in real world scenarios.

Interestingly, performance still isn't great, so now I'm going down the vSphere 4 route and trying to find out about DirectPath, but I'm not having much luck finding info on it yet.
