John4321
Contributor

How to deploy Microsoft Cluster Service/Failover Cluster on VMware VMs - advice appreciated.

Hi

I wish to deploy 2 Microsoft File servers (VMs) that share the same storage.

  • The reason is to provide resilience and high availability.
    • Resilience in that if one of the file servers fails, the other will be able to read/write to the storage.
    • High availability so that we can take one of the servers out of production to apply Windows updates and carry out any other maintenance procedures.

HA/Resilience of the data storage itself will be handled by another DR solution.

We also want the solution to be relatively simple.

The simplest solution I can see is to deploy a failover cluster in Windows.

The way to do this appears to be to "Cluster Virtual Machines on One Physical Host".

The instructions for VMware are fairly clear, but I have questions which I can't find the answer to.

Any advice or answers people might have will be much appreciated.

1. When setting up the Failover Cluster in Windows, where is the "private" network used?

2. I believe the same vmdk is attached to both VMs.

  • When setting up the Failover Cluster in Windows, does it recognise the drive on both VMs (vmdk) as the same disk in Failover Cluster Manager?

3. Can you read/write to the disk at the same time on both file servers? (I can see that the disk is set to multi-write).

  • I'm not suggesting that we need to, I wondered if this was a limitation.

4. For the "Cluster-in-a-box" solution, vmotion is not supported/possible.

  • What happens if the host that the VMs are on goes down, will the VMs be brought up on another host?
    • I'm not sure if this comes under vmotion or not.
    • I believe that they are brought up on another host as the guide states "Limit the number of hosts to two when you define host DRS group rules for a cluster of virtual machines on one physical host".

5. Is it not possible to snapshot the shared disk as it is "Independent-Persistent"?

I'm hoping that somebody has already set up this environment and will be able to guide me.

I will be setting up the environment in our testing environment but I don't want to re-invent the wheel twice!

Thanks

John

1 Solution

Accepted Solutions
ReginaldoT
Contributor

1. The private network is used for the WSFC heartbeat. You should check the WSFC topologies for recent versions of Windows to see whether it is still used the way it was on earlier versions.

2 and 3. That is correct! The disks are shared between both VMs. This does not allow the volume to be written from both cluster nodes at the same time, since WSFC runs in active-passive mode. The Windows Cluster Disk Driver reserves the disk so that the passive node cannot write to it and makes it read/write on the active node.

4. vMotion is not supported because of the SCSI bus sharing. The "if the host goes down" scenario is a High Availability response (not vMotion). Both nodes will be powered on on another healthy host.

5. Independent-Persistent disks cannot be snapshotted.
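
To picture points 2 and 3, here is a toy Python sketch of the active-passive idea: one node holds the reservation and can write, the other is rejected until a failover moves ownership. It is only an illustration. The class, method, and node names are made up for this example and are not part of WSFC, the Cluster Disk Driver, or any VMware API.

```python
class SharedClusterDisk:
    """Toy model of a shared disk with one active owner (hypothetical, not a real API)."""

    def __init__(self):
        self.owner = None   # node currently holding the reservation
        self.blocks = []    # data written so far

    def bring_online(self, node):
        # Only one node may hold the reservation at a time.
        if self.owner is not None and self.owner != node:
            raise PermissionError(f"disk is reserved by {self.owner}")
        self.owner = node

    def write(self, node, data):
        # The passive node is refused; only the active owner may write.
        if node != self.owner:
            raise PermissionError(f"{node} is passive and cannot write")
        self.blocks.append(data)

    def fail_over(self, new_owner):
        # The previous owner loses the reservation and the new owner takes it.
        self.owner = new_owner


disk = SharedClusterDisk()
disk.bring_online("node1")
disk.write("node1", "report.docx")

try:
    disk.write("node2", "notes.txt")   # node2 is passive at this point
except PermissionError as err:
    print("rejected:", err)

disk.fail_over("node2")                # e.g. node1 taken down for patching
disk.write("node2", "notes.txt")
print(disk.blocks)
```

The same storage is visible to both nodes throughout; what changes on failover is only which node is allowed to write to it.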

6 Replies
IRIX201110141
Champion

1. You have to create another VM port group on your vSwitch, add a second vNIC to each VM, and set up a new IP subnet.

5. Correct. No snapshot will be created for that type of vDisk.

Regards,

Joerg

John4321
Contributor

Hi

A follow up question which hopefully you will be able to help me with.

I have assumed that a disk that is shared by 2 VMs and is being used by Microsoft Failover Cluster requires:

  • Disk Provisioning = Thick provision eager zero
  • Sharing = Multi-Writer
  • Virtual Device Node = separate SCSI Controller
  • Disk Mode = Independent - Persistent

However, I realised that I got these requirements from this site:

When I look at the following VMware sites:

They only seem to talk about having the disk in "eagerzeroedthick" format; they don't mention "Independent - Persistent" or "Multi-Writer".

[Screenshots of Site 1 and Site 2: Image1.jpg, Image2.jpg]

Does the disk need to be "Independent - Persistent" and "Multi-Writer"?

I will test this out on the laboratory environment but I thought it would be good to get your input if you knew.

I'm interested in this as we use Veeam to do a server level backup and so having a persistent disk prevents this.

Thank you.

John

ReginaldoT
Contributor

You don't need to worry about disk persistence or the multi-writer flag for MSCS deployments.

1. Create a second SCSI controller on node 1

2. Enable virtual bus sharing on it

3. Add thick provisioned eager zeroed disks to serve as clustered volumes on the controller you just added

4. Create a second SCSI controller on node 2

5. Enable virtual bus sharing on it too

6. Add the disks you just created to node 2 (on the new controller as well)

7. Create the affinity rule to keep both VMs on the same host

8. Power on both nodes and set up your MSCS cluster
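
If it helps to sanity-check a lab build against the steps above, here is a rough Python sketch of the checklist. It talks to nothing in vSphere: NodeConfig, check_cluster_in_a_box, and the host, VM, and disk names are all invented for illustration, and it assumes controller 0 is the boot controller, so the shared controller must be numbered 1 or higher.

```python
from dataclasses import dataclass


@dataclass
class NodeConfig:
    """Hand-entered description of one cluster node VM (hypothetical model)."""
    name: str
    host: str               # ESXi host the VM runs on
    shared_controller: int  # SCSI controller number holding the shared disks
    bus_sharing: str        # "none", "virtual" or "physical"
    disk_format: str        # e.g. "eagerzeroedthick"
    shared_disks: tuple     # disk file names attached to that controller


def check_cluster_in_a_box(node1, node2):
    """Return a list of deviations from the eight steps; empty means none found."""
    problems = []
    for node in (node1, node2):
        # Steps 1 and 4: shared disks go on a second controller (assumed: not controller 0).
        if node.shared_controller < 1:
            problems.append(f"{node.name}: put shared disks on a second SCSI controller")
        # Steps 2 and 5: virtual bus sharing on that controller.
        if node.bus_sharing != "virtual":
            problems.append(f"{node.name}: enable virtual bus sharing on that controller")
        # Step 3: thick provisioned eager zeroed disks.
        if node.disk_format != "eagerzeroedthick":
            problems.append(f"{node.name}: shared disks must be thick provisioned eager zeroed")
    # Step 6: node 2 attaches the same disks that were created on node 1.
    if sorted(node1.shared_disks) != sorted(node2.shared_disks):
        problems.append("both nodes must attach the same shared disks")
    # Step 7: affinity rule keeps both VMs on one host.
    if node1.host != node2.host:
        problems.append("affinity rule missing: both VMs must run on the same host")
    return problems


node1 = NodeConfig("fs1", "esx01", 1, "virtual", "eagerzeroedthick", ("quorum.vmdk", "data.vmdk"))
node2 = NodeConfig("fs2", "esx02", 1, "none", "eagerzeroedthick", ("quorum.vmdk", "data.vmdk"))

for problem in check_cluster_in_a_box(node1, node2):
    print(problem)
```

In this made-up example node 2 has bus sharing left at "none" and sits on a different host, so both of those are reported. Step 8 (powering on and building the cluster in Windows) is outside what a check like this can cover.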

About the backup: the MSCS setup doesn't let you take snapshot backups. As a workaround, you should use file system level backups and/or OS system state backups.

John4321
Contributor

Hi

That is very helpful. Out of interest, why doesn't the MSCS setup allow you to take snapshot backups? Is it because a snapshot can't be applied (and if so, why?) or is it that it is not recommended?

I have created the setup on the laboratory environment using the multi-write/persistent disk and it worked very well. I will now test it without the multi-write/persistent attributes.

Once again, thank you for all your help. It has been extremely useful.


John

ReginaldoT
Contributor

That's correct. MSCS does not support snapshots because of bus sharing, so it cannot take advantage of image snapshots. On the other hand, on 6.5 and later, vMotion of VMs with bus-sharing SCSI controllers is supported (on MSCS you have to do a little bit of disk timeout tuning).

I have never used MSCS/WSFC with the multi-writer flag. I think this setup is only supported on Oracle RAC builds.