jim33boy
Contributor

VSAN 2 Node Shutdown procedure

Hi All

I have a 2 node VSAN cluster with an external VCSA and witness appliance.

I cannot find the shutdown procedure for a 2 Node VSAN with a witness appliance.

I found the KB for shutting down a VSAN with the VCSA running on top of the cluster but it does not mention the witness appliance.

https://kb.vmware.com/s/article/2142676

Does anyone have a procedure to shut down a 2-node cluster with an external VCSA and witness appliance?

 

Thank you  

8 Replies
TheBobkin
VMware Employee

@jim33boy, it is basically just a case of powering down all VMs running on the nodes, validating via the vSphere Client that there is no ongoing resync and via the vSAN Health UI that all Objects are healthy, then placing all nodes (including the Witness) in Maintenance Mode with the 'No Action' option.

If you want to take an additional precaution, there is a script (built in from 6.7 U3 onward; it can be downloaded and installed on earlier versions) that runs some other prechecks and intentionally partitions all nodes from each other, so that there is no possibility of a data-state change or update:
https://kb.vmware.com/s/article/70650
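
As a rough illustration of the order described above, here is a minimal Python sketch that models the prechecks and the maintenance-mode step as plain data. The `ClusterState` fields, the host names, and the `shutdown_plan` function are hypothetical names made up for this example. Nothing here calls a VMware API; the values would have to be read off the vSphere Client and vSAN Health UI by hand.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterState:
    """Hypothetical snapshot of what you would read off the vSphere Client."""
    running_vms: int        # VMs still powered on across the data nodes
    resyncing_objects: int  # objects currently resyncing
    unhealthy_objects: int  # objects not reported healthy by vSAN Health
    hosts: List[str]        # both data nodes plus the Witness


def shutdown_plan(state: ClusterState) -> List[str]:
    """Return the ordered manual steps, or raise if a precheck fails."""
    if state.running_vms:
        raise RuntimeError(f"{state.running_vms} VM(s) still powered on")
    if state.resyncing_objects:
        raise RuntimeError("resync still in progress")
    if state.unhealthy_objects:
        raise RuntimeError("not all objects are healthy")
    # Every host, Witness included, enters maintenance mode before any shutdown.
    steps = [f"enter maintenance mode (No Action): {h}" for h in state.hosts]
    steps += [f"shut down: {h}" for h in state.hosts]
    return steps


if __name__ == "__main__":
    # Host names below are placeholders, not taken from the thread.
    state = ClusterState(0, 0, 0, ["esx-a", "esx-b", "witness"])
    for step in shutdown_plan(state):
        print(step)
```

The sketch only encodes the ordering: all prechecks pass first, then every host goes into maintenance mode, and only then is anything shut down.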

jim33boy
Contributor

Thank you for the reply.

 

So in this scenario, excluding the script you referenced for the moment, after placing all nodes in maintenance mode, including the witness, I could then shut down the VSAN nodes including the witness?

And then bring down vCenter.

On power up, is there any requirement to bring up the witness appliance before the VSAN nodes or vice versa?

 

Thanks again 

TheBobkin
VMware Employee

"after placing all nodes in maintenance mode, including the witness, I could then shutdown the VSAN nodes including the witness?"
Yes.

 

"And then bring down Vcenter."
This is unnecessary, as you mentioned that it is 'external'. By this I assumed you meant it is running and stored somewhere other than on this cluster; please clarify if this is not the case.

 

"On power up, is there any requirement to bring up the witness appliance before the VSAN nodes or vice versa?"
The order in which they are brought up is irrelevant. The important part is to validate that all 3 are up and healthy before taking any of them out of Maintenance Mode (MM), and to take them all out of MM within a reasonably short timeframe (e.g. don't take one data-node plus the Witness out of MM and leave the other data-node in MM for hours).
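
A minimal sketch of that gate, assuming you record by hand whether each host is up and healthy and when it left MM. The function names, the host names, and the 10-minute default are assumptions for illustration only; the advice above just says "a reasonably short timeframe".

```python
from datetime import datetime, timedelta
from typing import Dict


def safe_to_exit_mm(host_healthy: Dict[str, bool]) -> bool:
    """True only when all three hosts (two data nodes and the Witness) are up and healthy."""
    return len(host_healthy) == 3 and all(host_healthy.values())


def exit_spread_ok(exit_times: Dict[str, datetime],
                   max_spread: timedelta = timedelta(minutes=10)) -> bool:
    """True when every host left maintenance mode within max_spread of the others.

    The 10-minute default is an assumed value, not a VMware-documented limit.
    """
    times = list(exit_times.values())
    return max(times) - min(times) <= max_spread


if __name__ == "__main__":
    # Placeholder host names; power-on order does not matter, only health.
    print(safe_to_exit_mm({"esx-a": True, "esx-b": True, "witness": True}))
```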

jim33boy
Contributor

Thanks again for the help.

"And then bring down Vcenter."
"This is unnecessary, as you mentioned that it is 'external'. By this I assumed you meant it is running and stored somewhere other than on this cluster; please clarify if this is not the case."

 

Correct, it is external, but everything has to come down for server room maintenance. So the plan is to enter maintenance mode, shut down the hosts and witness from VCSA, and then shut down VCSA. Then on power up, VCSA comes on first; once VCSA has initialized fully, power up the VSAN nodes and witness within 5 minutes of each other.

 

Thanks

TheBobkin
VMware Employee

Okay, should be grand.

 

Just to reiterate though: powering up the data-nodes and Witness within a few minutes of one another isn't the key part. What matters is taking them all out of MM within a few minutes of each other, and validating that all is good with them before doing so. For example, in the vSAN Health UI the checks for cluster partition, network communication, disk operational state etc. should all be Green; Data health is expected to be Red until the nodes are taken out of MM.
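
To make the "which Red is acceptable" point concrete, here is a small sketch that filters a hand-recorded set of health results. The check labels and the `blocking_checks` helper are hypothetical; they paraphrase the categories named above and are not the exact test names shown in the vSAN Health UI.

```python
from typing import Dict, List

# Categories that should be green before leaving maintenance mode.
# Labels are illustrative, not the literal vSAN Health test names.
MUST_BE_GREEN = (
    "cluster partition",
    "network communication",
    "disk operational state",
)


def blocking_checks(results: Dict[str, str]) -> List[str]:
    """Return the checks that should stop you from exiting MM.

    A red 'data health' entry is expected while hosts are in MM, so it is
    never reported as blocking. A check that is missing counts as blocking.
    """
    return [name for name in MUST_BE_GREEN if results.get(name) != "green"]


if __name__ == "__main__":
    observed = {
        "cluster partition": "green",
        "network communication": "green",
        "disk operational state": "green",
        "data health": "red",  # expected until the nodes leave MM
    }
    print(blocking_checks(observed))
```

An empty list here means nothing in the recorded results stands in the way of taking all three hosts out of MM.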

jim33boy
Contributor

Very helpful, thank you.

 

Do I need to disable HA prior to doing any of this or is the Maintenance Mode with no action enough?

TheBobkin
VMware Employee

Well, all the VMs will be powered off, so HA won't be doing anything with them (i.e. they have not crashed or become unavailable on a specific node).

Once the data-nodes are in MM their data will not be accessible (as MM basically puts the Disk-Groups and thus the data on them in an unavailable state) and thus nothing would be able to power on/restart/re-register the VMs anyway.

 

So, nope, nothing needs to be done with HA.

jim33boy
Contributor

Fantastic, thank you very much, this is helpful. I will proceed with maintenance. 
