VMware Cloud Community
Odurasler2
Enthusiast

Microsoft Clustering and RDM

Is anyone running MSCS across two ESX Hosts with an EMC Clariion backend successfully?

I've been having a hard time getting MSCS to work using RDMs. I followed the VMware documentation step by step, and I'm having intermittent issues when I run the Storage section of the MSCS Validation Wizard. The problem may be due to the way the RAW LUNs are being presented to the ESX hosts. It was pointed out that I should move from one "Storage Group" per host to a single Storage Group for all my ESX servers so that the presentation is consistent throughout. I still had the same issue after moving my clustered ESX farm to a single Storage Group within the Clariion's Navisphere.

One of the errors I get during the MSCS Validation Wizard is in the "Validate Disk Arbitration" test: "Failed to release cluster disk 2 from node ServerA, failure reason: The requested resource is in use."

Also, I read in one of the forums that MSCS does not support multipathing software such as PowerPath/VE, which is what we have running. Could someone confirm whether this is true or not?

11 Replies
RParker
Immortal

The problem may be due to the way the RAW LUNS are being presented to the ESX hosts.

RDMs are not presented to the hosts; they are presented to the VMs, hence RAW data. The ESX hosts should NOT have access to these volumes or LUNs; that may be the problem.

It's raw storage; your VMs use whatever method (iSCSI or Fibre Channel) to access the data.

Odurasler2
Enthusiast

"The ESX hosts should NOT have access to these volumes or LUNs; that may be the problem."

Not sure what you mean by that comment. The ESX hosts do see the LUNs that are shared by the clustered VMs. And yes, the VMs access the RAW LUNs via the RDMs that were created during the configuration of the VMs.

To elaborate on my first post: I had an issue where both ESX hosts running the clustered VMs had to see the shared LUNs under the same LUN ID or Runtime Name. If they are not the same, one of the VMs will not power on due to a 'lock issue.'
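The LUN ID requirement described above can be checked by comparing what each host reports for the same device. The following is only a minimal sketch: the host names, device identifiers, and LUN IDs are made up for illustration, and the inventory is assumed to have been collected by hand from each host's storage view.

```python
from collections import defaultdict


def find_lun_id_mismatches(host_luns):
    """Return {device: {host: lun_id}} for devices whose LUN ID differs between hosts.

    host_luns maps a host name to {device identifier: LUN ID seen by that host}.
    """
    seen = defaultdict(dict)
    for host, luns in host_luns.items():
        for device, lun_id in luns.items():
            seen[device][host] = lun_id
    # Keep only devices that are reported with more than one distinct LUN ID.
    return {
        device: per_host
        for device, per_host in seen.items()
        if len(set(per_host.values())) > 1
    }


# Hypothetical inventory (all names and numbers are illustrative only).
inventory = {
    "esx01": {"naa.6006016000000001": 10, "naa.6006016000000002": 11},
    "esx02": {"naa.6006016000000001": 10, "naa.6006016000000002": 12},
}

for device, per_host in sorted(find_lun_id_mismatches(inventory).items()):
    print(device, per_host)
```

Any device this prints is presented with different LUN IDs on different hosts, which is the situation that led to the 'lock issue' at power-on.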

RParker
Immortal

The ESX hosts do see the LUNs that are shared by the clustered VMs.

Exactly, you did it WRONG.

The ESX hosts should NOT see the LUNs; the LUNs should NOT be visible to them. As in, ONLY the VMs should see the RAW storage.

Hint: The VMs do NOT need ESX host access in order to connect DIRECTLY to the RDM. The VMs ONLY need the HBA from the ESX host to be visible, but ESX should not have any type of read/write access to the VM's RDM.

The second you ADD ESX to the cluster and ESX can access the storage, it will TRY to use it. That's why you are having trouble and why you get the message (in use).

Odurasler2
Enthusiast

RParker,

First of all, thanks for all of your replies.

I'm trying to wrap my head around your comments, and I think I may be using the wrong words in mine. You are right that my ESX hosts should not see the LUNs, and they do not. For example, if I go to add storage (VMFS), I do not see those LUNs.

Maybe I should have worded it like this: "The raw LUNs are added into my "ESX Cluster Storage Group" within Navisphere."

Hopefully this clarifies things. If not, let me know.

Cheride
Contributor

I do not know exactly what your problem is, but if you want to present those RAW LUNs to your VMs as RDMs, they must be visible to the ESX server. That means that when you go to Add Storage on the ESX server, you should see those RAW LUNs. You should never convert them to VMFS datastores; use them as RDMs when you build the VMs.

If your physical hardware (bare metal) cannot see the LUNs, how will the VMs running on it see them?

mcowger
Immortal

Mr Parker is incorrect here.

RDMs MUST be presented to the ESX hosts on which the VMs will reside, and then assigned to the VMs via the regular RDM management tools.

--Matt VCDX #52 blog.cowger.us
SurfControl
Enthusiast

Is the disk configured as physical?

Is the SCSI controller configured as physical?

Is the disk on a separate SCSI channel from the OS disk?
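Two of the questions above (controller set to physical, shared disk on a different SCSI controller than the OS disk) could be checked against a VM's settings with something like the sketch below. It assumes the settings are available as `key = "value"` lines in the style of a .vmx file, that the controller's bus sharing appears under a key such as `scsi1.sharedBus`, and that the OS disk sits on controller 0. The key names and file names in the sample are assumptions to verify against a real VM, and the sketch does not check the RDM compatibility mode of the disk itself.

```python
import re

# Matches lines of the form: some.key = "value"
LINE_RE = re.compile(r'^\s*([^=\s]+)\s*=\s*"(.*)"\s*$')


def parse_settings(text):
    """Parse key = "value" lines into a dict with lower-cased keys."""
    settings = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            settings[match.group(1).lower()] = match.group(2)
    return settings


def check_shared_disk(settings, controller, unit, os_controller=0):
    """Return a list of problems found for the shared disk at scsi<controller>:<unit>."""
    problems = []
    if f"scsi{controller}:{unit}.filename" not in settings:
        problems.append(f"no disk at scsi{controller}:{unit}")
    if controller == os_controller:
        problems.append("shared disk is on the same SCSI controller as the OS disk")
    # A missing bus sharing key is treated as "none" (an assumption of this sketch).
    bus = settings.get(f"scsi{controller}.sharedbus", "none")
    if bus.lower() != "physical":
        problems.append(f"scsi{controller} bus sharing is '{bus}', expected 'physical'")
    return problems


# Hypothetical settings for one cluster node (illustrative values only).
sample = '''
scsi0.present = "TRUE"
scsi0:0.fileName = "node1.vmdk"
scsi1.present = "TRUE"
scsi1.sharedBus = "none"
scsi1:0.fileName = "quorum-rdm.vmdk"
'''

print(check_shared_disk(parse_settings(sample), controller=1, unit=0))
```

An empty list means none of the checked conditions failed; it does not prove the cluster configuration is otherwise correct.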

NuggetGTR
VMware Employee

One thing I find most people forget when creating the virtual machine is to tick the box that says it's going to be used for clustering or fault tolerance. But if you have followed the VMware side of the MSCS setup to the letter, it could be an OS issue.

Are you using Windows 2008?

If you open Disk Management, does it say Reserved on the cluster disks?

If so, it sounds like a stale PR (persistent reservation). The easiest fix is to use the cluster.exe CLI to try clearing the PR. Here is a sample command:

cluster node w2k8-cl1 /clearpr:3

3 = the number of the disk as seen in the Disk Management interface.

You can also re-run all of the storage validation tests, ensuring you clear the check box in the wizard allowing the validation to run against all storage. Part of the validation process for storage is to clear all PRs on the drives.

Re-run the validation and it might pass.

Apparently it's normal for Windows 2008 disks to show up as reserved, according to Microsoft.

cheers

________________________________________ Blog: http://virtualiseme.net.au VCDX #201 Author of Mastering vRealize Operations Manager
Odurasler2
Enthusiast

Hi everyone and thank you for all your replies.

I forgot to update this thread, so I'll go ahead and provide an update.

My problem has been resolved! As I stated earlier, I read that you can't use multipathing when using clustering with RDMs. Sure enough, the LUNs that were presented as "RAW LUNs" were using PowerPath. After looking at a VMware KB article, I excluded those LUNs from PowerPath so that they use just the MRU path policy. I also put all of my ESX servers under one "Storage Group" in the Clariion's Navisphere so that they all see the same LUN ID, etc. Once I performed these tasks, I was able to get Windows 2008 R2 with SQL 2008 clustering to work.
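For anyone repeating this, a quick way to confirm that every RDM LUN ended up on MRU on every host is to tabulate the path policy per host and device and flag the exceptions. This is just a sketch with made-up data: the host names, device identifiers, and the policy labels "MRU" and "PowerPath" are placeholders, not strings taken from any VMware or EMC tool.

```python
def luns_needing_attention(policies, rdm_devices, wanted="MRU"):
    """Return sorted (host, device, policy) tuples for RDM LUNs not on the wanted policy.

    policies maps a host name to {device identifier: path policy label}.
    A device a host does not report at all is flagged as "not visible".
    """
    findings = []
    for host in sorted(policies):
        for device in sorted(rdm_devices):
            policy = policies[host].get(device, "not visible")
            if policy != wanted:
                findings.append((host, device, policy))
    return findings


# Hypothetical data collected by hand from each host (illustrative only).
policies = {
    "esx01": {"naa.6006016000000001": "MRU", "naa.6006016000000002": "MRU"},
    "esx02": {"naa.6006016000000001": "PowerPath"},
}
rdm_devices = ["naa.6006016000000001", "naa.6006016000000002"]

for finding in luns_needing_attention(policies, rdm_devices):
    print(finding)
```

With the sample data, the two findings are both on esx02: one LUN is still claimed by PowerPath and the other is not visible to that host at all.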

Thanks again, everyone, for the replies.

Cheride
Contributor

Great. I have one additional question and I hope to see some responses.

Is there a proven technology available to create a Windows 2008 R2 failover cluster in ESX 4.0 without using RDMs?

I need high availability, hence I have to keep my Microsoft cluster nodes (VMs) on separate ESX hosts in the same HA cluster.

If you ask why this is required, the answer is below.

We use Vizioncore vReplicator to replicate our VMs, and unfortunately at this point there is no solution to replicate RDMs to the DR site.

Thanks

Deepu

mcowger
Immortal

Nope - if you want an MSCS cluster, the only proven and supported way is with RDMs.

--Matt VCDX #52 blog.cowger.us