VMware Cloud Community
Cccb16
Contributor

Maximum hosts in maintenance mode - vSAN Stretched

Hello!!

Let me explain my question.

Environment:

preferred site: 10 hosts

secondary site: 10 hosts

witness site

Site disaster tolerance: Dual site mirroring

Failures to tolerate: RAID-5 (Erasure Coding)

How many hosts can be in maintenance mode at the same time without risk? I know that all hosts of one site could be in maintenance mode, but if one host at the other site then fails, could we have problems?

What are the best practices for maintenance mode in a Stretched Cluster? Only one host at a time?

Thank you very much in advance.

3 Replies
TheBobkin
Champion

Hello Cccb16,

Welcome to Communities.

"How many hosts can be in maintenance mode at the same time without risk?"

This depends entirely on what you mean by "risk" - generally I would consider data being at risk if it has no redundancy for a prolonged period.

What is the intended purpose of placing the nodes in Maintenance Mode?

For example, are you performing updates/patches that can be done one node at a time, or is it site maintenance that requires an entire site to be shut down and/or takes a relatively prolonged period (e.g. 10+ hours or days)?

"I know that all hosts of one site could be in maintenance mode, but if one host at the other site then fails, could we have problems?"

With a Storage Policy of PFTT=1, SFTT=1, LocalFTM=RAID5, the data is essentially a RAID1 (across sites) of RAID5s. This means an entire site can be shut down or in Maintenance Mode (MM) and the data will still be FTT=1 on the remaining site, so it will remain available if a subsequent single node, disk, or Disk-Group failure occurs (though it will then be FTT=0 until it has been repaired back to FTT=1).
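
To make the "RAID1 of RAID5s" reasoning a bit more concrete, here is a rough Python sketch of a single object under that policy. It is only an illustration, not how vSAN decides availability internally: it assumes each site's RAID-5 copy is laid out as 4 components on 4 different hosts, that the witness stays reachable, and it ignores votes/quorum and resync timing. The function names are made up for this example.

```python
# Simplified model of ONE vSAN object with PFTT=1, SFTT=1, LocalFTM=RAID5.
# Assumptions (not taken from the thread): each site holds one RAID-5 copy
# of 4 components on 4 different hosts, and the witness remains reachable.
# Votes/quorum and repair timing are deliberately not modelled.

COMPONENTS_PER_COPY = 4  # assumed 3 data + 1 parity layout per site
RAID5_TOLERATED = 1      # a RAID-5 copy survives one missing component


def copy_usable(missing: int) -> bool:
    """A site's RAID-5 copy is readable with at most one component missing."""
    return missing <= RAID5_TOLERATED


def object_status(missing_site_a: int, missing_site_b: int) -> str:
    """Classify the object given how many components are missing per site."""
    usable = [m for m in (missing_site_a, missing_site_b) if copy_usable(m)]
    if not usable:
        return "unavailable"
    if len(usable) == 2:
        return "available, protected across sites"
    # Exactly one site copy is still usable.
    if usable[0] == 0:
        return "available, FTT=1 on remaining site"
    return "available, FTT=0 until repaired"


if __name__ == "__main__":
    site_down = COMPONENTS_PER_COPY  # whole site in Maintenance Mode
    print(object_status(site_down, 0))  # available, FTT=1 on remaining site
    print(object_status(site_down, 1))  # available, FTT=0 until repaired
    print(object_status(site_down, 2))  # unavailable
```

The second and third calls are the cases to keep in mind when taking a whole site down: one further failure on the surviving site is tolerated, a second one affecting the same object is not.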

"What are the best practices for maintenance mode in a Stretched Cluster? Only one host at a time?"

As I alluded to above, it depends on what you are doing. If a whole site needs to be shut down for maintenance, then this is the very reason you have PFTT=1 and SFTT=1. If you want to update a whole site at a time, this is also fine, provided your data is adequately protected and you understand the implications of doing it this way as opposed to one node at a time.

Bob

Cccb16
Contributor

Hello,


The main reason is to update firmware and drivers and to upgrade the ESXi version. I would like to do this in a safe manner and in the shortest time possible. I used to do one host at a time, but maybe there is a faster way.

Thank you very much for your answer.

TheBobkin
Champion

Hello Cccb16

Provided all the data is indeed using a Storage Policy with dual-site local protection, patching one site at a time (and don't forget the Witness!) should be fine. I see customers doing this regularly without issues, but as with any maintenance (on vSAN or otherwise), ensure you have a good and current set of backups before you begin.

I would advise confirming that all Objects have a dual-site policy by looking at the data generated from 'esxcli vsan debug object list' (with flag '--all' if on 6.7 U3 or later) - if you PM me this I can take a look.

Bob
