VMware Cloud Community
goofygoose
Contributor

VMFS3 for MSCS shared storage, why not?

The paper on configuring MSCS across physical hosts on ESX 3.5 states that RDMs must be used for the quorum and all other shared disk volumes.

I have been wondering why the quorum and shared storage can't be on VMFS3 partitions instead. I have a large number of MSCS clusters to set up, and using VMFS3 for shared storage would definitely make the SAN administration a lot easier, not to mention reduce the large number of LUNs that would have to be configured on my storage controller, with the risk of exceeding the maximum number of LUNs it can support.

Hence, I have set up a 2-node MSCS cluster across 2 different hosts, using a shared VMFS3 partition on a shared iSCSI target. The shared VMDK files (of type eagerzeroedthick) for the quorum and shared storage were created at the service console using vmkfstools. The SCSI controller for the quorum and shared disks is of type "LSI Logic" with the bus sharing mode set to "Physical".
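
For anyone who wants to reproduce this, the disk creation looked roughly like the following from the service console. The sizes, datastore name and folder are placeholders for my actual setup:

# create eagerzeroedthick disks for the quorum and shared data on the shared VMFS3 volume
vmkfstools -c 1G -d eagerzeroedthick /vmfs/volumes/shared_vmfs/mscs-node1/quorum.vmdk
vmkfstools -c 100G -d eagerzeroedthick /vmfs/volumes/shared_vmfs/mscs-node1/data.vmdk

Both nodes then attach the shared disks to a second SCSI controller in their .vmx files, along these lines:

scsi1.present = "true"
scsi1.virtualDev = "lsilogic"
scsi1.sharedBus = "physical"
scsi1:0.present = "true"
scsi1:0.fileName = "/vmfs/volumes/shared_vmfs/mscs-node1/quorum.vmdk"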

The rest of the configuration followed the VMware MSCS configuration guide and the Microsoft MSCS configuration guides. Both guest OSes are Windows Server 2003 Enterprise Edition.

The MSCS installation and configuration completed successfully, and my initial testing looks good, with the nodes failing over well in both planned and unplanned scenarios.

So my big question is... what's wrong with using VMFS3 for the quorum (and shared storage) instead of RDMs?

goofygoose
Contributor

I forgot to add that I am using ESX 3.5 Update 3 and vCenter 2.5.

runclear
Expert

Try taking a VCB backup or a standalone snapshot, and see what happens :smileymischief:

goofygoose
Contributor

Ohh thanks! I am starting to see some differences between the 2 implementations!

I don't use VCB for my backups (luckily?). Correct me if I am wrong, but I have the impression we cannot take snapshots of RDM disks either?

Are there any other areas I should look out for?

dclark
Enthusiast

I think you can snapshot raw disks if you set the compatibility mode to virtual... but I must admit I have never tried it...
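
If you want to try it, a virtual compatibility mode RDM can be created from the service console with vmkfstools; the LUN path and file name below are only examples:

# -r maps the raw LUN in virtual compatibility mode (snapshot-capable);
# -z would map it in physical compatibility mode instead
vmkfstools -r /vmfs/devices/disks/vmhba32:0:1:0 /vmfs/volumes/datastore1/testvm/rdm-virtual.vmdk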

aleph0
Hot Shot

This could be a little off topic... just my 2 cents on your configuration...

MSCS is not supported by ESX on iSCSI, only on FC...

cheers

\aleph0 (http://virtualaleph.blogspot.com/, in Italian)
goofygoose
Contributor

Please pardon me, but what is meant by "snap raw disks"?

goofygoose
Contributor

I am really curious why it isn't supported on iSCSI by VMware. Having MSCS restricted to FC RDM is just so frustrating...

I wonder if I am missing something critical with MSCS on iSCSI....

aleph0
Hot Shot

Could you please tell me what services you are running on your virtualized MSCS clusters?

Thank you

\aleph0

goofygoose
Contributor

vCenter, Exchange, IIS, MSSQL, Oracle, WebSphere.

aleph0
Hot Shot

Why are you clustering vCenter? I think that's not supported by VMware.

Does the clustered SQL maintain the VC DB?

Do you trust your configuration? Is it in production?

\aleph0

goofygoose
Contributor

You mean VMware does not support clustering vCenter, even though VMware has released a paper on clustering vCenter (http://www.vmware.com/resources/techresources/945) and it is recognized as a "proven practice" at VIOPS (http://viops.vmware.com/home/docs/DOC-1104)? I seriously need to go back to reading up on what VMware actually supports.

The clustered SQL maintains the VC DB as well as the databases for other applications.

The configuration looks fine so far. I am doing some testing and studying before concluding on its suitability for production, so all feedback is greatly appreciated! 😃

dclark
Enthusiast

Sorry, I should clarify... RDM = Raw Device Mapping, which I refer to as a RAW disk, and by "snap" I mean "take a snapshot".

aleph0
Hot Shot

In my opinion:

1 virtual machine with vCenter (but not in a cluster).

1 physical SQL server (maybe in a cluster) for the VC DB and any really disk-I/O-intensive DBs servicing your enterprise.

1 virtual SQL server for the non-disk-I/O-intensive DBs.

The rest is up to you; I would not go for clustering inside ESX. Moreover, the next generation of ESX will provide a Fault Tolerance feature for VMs, which will give you business continuity instead of just high availability (the HA feature).

HTH

\aleph0

goofygoose
Contributor

Ohh... I didn't know that... It will be useful for my other non-clustered VMs with RDMs... I shall try that tomorrow! 😃

goofygoose
Contributor

I hope the Fault Tolerance feature is released soon! That will certainly solve a lot of my problems! But I heard that FT currently supports only VMs with a single vCPU... that will probably pose an issue for some of my VMs...

On a side note: I like your blog! And I have been wanting to pick up PowerShell for ages!

kjb007
Immortal

While you can create cluster disks on VMFS storage, it is not supported. Since both cluster nodes share the disk that they mount between them, a triggered I/O to that disk can cause a cluster node to think the other node is down, and the same can occur if you have extended issues with VMotion or a SAN connectivity problem. This can cause a split-brain scenario: the passive node tries to take over the disk, but then the I/O issue lapses and the active node comes back online. Depending on the timing, you can end up with data corruption on the shared disk. This is why clustering across hosts requires a physical-mode RDM, which also precludes VMotion due to the physical bus sharing.
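
For reference, the supported configuration maps each shared LUN as a physical compatibility mode RDM, which passes SCSI commands (including the reservations MSCS uses for disk arbitration) straight through to the array. From the service console it would look something like this, with the LUN path and file name only illustrative:

# -z maps the raw LUN in physical (pass-through) compatibility mode
vmkfstools -z /vmfs/devices/disks/vmhba1:0:5:0 /vmfs/volumes/datastore1/mscs-node1/quorum-rdm.vmdk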

-KjB

aleph0
Hot Shot

Why are you using multiple vCPUs? For which services?

Do the servers run applications that are SMP-aware?

If not, always use 1 vCPU.

Also, remember that the Windows kernel is selected as uniprocessor or multiprocessor at installation time: if you install with the multiprocessor kernel and later revert the machine to 1 vCPU, the VM will carry execution overhead.

Cheers

\mf

goofygoose
Contributor

Please pardon me, as I am not very familiar with MSCS and storage configurations, but what is meant by "a triggered I/O", and how does an RDM prevent it from occurring?

goofygoose
Contributor

All my Oracle instances have at least 2 vCPUs allocated, if not 4.

We started off with 1 vCPU for all instances and have steadily increased the allocations after analyzing the performance data collected over the past year and more.
