VMware Cloud Community
Tim_1
Contributor

vMotion of a Windows Server Failover Cluster (WSFC) VM with a Cluster in a Box (CIB) Shared VMFS Disk when doing ESXi Host maintenance

Background: 

  • VMware 6.0 U2 (ESXi and vCenter), with an upgrade to 6.0 U3 or possibly 6.5 planned in 3-6 months. HPE ProLiant blade servers, all backed by a Dell Compellent SAN.
  • My backup solution is HPE Data Protector, which has a physical agent and can back up a physical RDM without issue. I am considering Veeam or similar in the future, but I am concerned that such products rely only on VM snapshots of VMDKs and won't be able to handle the physical RDM required for a WSFC Cluster Across Boxes (CAB), which is the approved configuration that allows vMotion.
  • For VM-level operating system maintenance, I'd like to set up WSFC so my file services and SQL databases stay up at all times and I don't have to take an outage for server patching.
  • VM-level backups are less disruptive to my file servers and SQL servers than the physical agent initiating a Volume Shadow Copy to back up a 1 TB+ volume.
  • I don't know everything and am open to suggestions, advice, and best practices for balancing service uptime, DR, and ongoing maintenance.

My current environment has no clustering: multiple SQL servers and file servers run independently as single points of failure, and each VM takes an outage when its OS is patched. Moving to a CIB setup, using VM-level backups exclusively, and avoiding an outage for patching would all be nice incentives. However, ESXi host maintenance looks like it could require an outage:

  

WSFC points...                          | No Clustering | Cluster in a Box | Cluster Across Boxes
vMotion                                 | Yes           | No               | Yes
Disk requirement                        | None          | VMDK only        | PRDM on shared disks
VM level backup and restoration ability | Yes: if VMDK  | Yes              | No
Outage: OS - Failure or Patching        | Yes           | No               | No
Outage: ESXi Failure or Maintenance     | No            | Yes              | No
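
To make the trade-off in the comparison above explicit, here is a minimal Python sketch that encodes the same matrix as data and filters it by requirements. The field names and the `options_meeting` helper are hypothetical, chosen only for illustration; the Yes/No values are transcribed from the comparison as posted, not from VMware documentation.

```python
# Feature matrix transcribed from the comparison above.
# Field names are illustrative assumptions, not VMware terminology.
WSFC_OPTIONS = {
    "No Clustering": {
        "vmotion": True,
        "vm_level_backup": True,  # per the comparison: only if the disks are VMDKs
        "outage_on_os_patching": True,
        "outage_on_esxi_maintenance": False,
    },
    "Cluster in a Box": {
        "vmotion": False,
        "vm_level_backup": True,
        "outage_on_os_patching": False,
        "outage_on_esxi_maintenance": True,
    },
    "Cluster Across Boxes": {
        "vmotion": True,
        "vm_level_backup": False,
        "outage_on_os_patching": False,
        "outage_on_esxi_maintenance": False,
    },
}


def options_meeting(requirements):
    """Return the names of options whose fields match every required value."""
    return [
        name
        for name, props in WSFC_OPTIONS.items()
        if all(props[key] == value for key, value in requirements.items())
    ]


# Everything on the wish list: VM-level backups and no outage for either
# OS patching or ESXi host maintenance.
wanted = {
    "vm_level_backup": True,
    "outage_on_os_patching": False,
    "outage_on_esxi_maintenance": False,
}
print(options_meeting(wanted))

# Dropping the ESXi maintenance requirement leaves a single candidate.
relaxed = {"vm_level_backup": True, "outage_on_os_patching": False}
print(options_meeting(relaxed))
```

With the values as posted, no option satisfies all three requirements at once, and relaxing the ESXi maintenance requirement leaves only Cluster in a Box, which is why the question below focuses on working around host maintenance for that mode.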

My question:

If I run WSFC in Cluster in a Box mode with 2 server nodes and need to do ESXi host maintenance, would it be possible to put 1 of the cluster nodes into maintenance, shut down that node's VM, and then vMotion both VMs of the cluster to a different host?

I'm assuming the disk timeout value has been increased in the OS of the cluster VMs, as in the CAB requirement. The cluster would be without high availability for a short time, but there would be no outage for ESXi host maintenance.

TIA

1 Reply
parmarr
VMware Employee

Hello,

My understanding is that you'll need to move all the VMs over to the other server using vMotion and then perform the maintenance. Also, you may want to look at this existing thread (it is old, but it may give you an idea): Understanding Esxi Maintenance mode and Standby mode

Sincerely, Rahul Parmar VMware Support Moderator