VMware Cloud Community
hrp
Contributor

rebuilding a failed host in vSAN 4 node cluster

Looking for some guidance on the best method to remove a problematic host from a 4-node vSAN cluster so that I can reinstall ESXi and rebuild the host config before rejoining it to the vSAN cluster.

The host in question has some issues in the boot file system and I am finding it impossible to apply updates.  The VMware support engineer advises rebuilding the host and then rejoining it to the cluster.

In a basic non-vSAN cluster, this would be a fairly routine procedure, but I am a little apprehensive about removing a node from the vSAN cluster, and am looking for a little reassurance.  Is it as simple as evacuating all of the VMs and vmdk files from the affected host, placing it in maintenance mode, and then removing it from the cluster and from vSphere inventory?

Suggestions?

Thanks in advance

1 Reply
TheBobkin
Champion

Hello hrp,

You have two options here:

Full Data Migration:

Put the host in Maintenance Mode with Full Data Evacuation of the node. This works provided the remaining 3 nodes have adequate free space and you are not using a RAID5 Storage Policy on any Objects, as RAID5 requires 4 available nodes. The Default vSAN Storage Policy is FTT=1 (RAID1), which only requires 3 available nodes.

Then remove the host from the cluster and re-install ESXi.
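
As a rough sanity check before choosing Full Data Migration, the two conditions above (enough remaining nodes for every Storage Policy in use, and enough free space on them) could be sketched in Python as below. This is only an illustration: the host names, capacities and the 25% free-space headroom are made-up values, the raw used/capacity arithmetic is a simplification, and the minimum node counts are the usual ones for RAID1/RAID5/RAID6 policies. vSAN itself decides whether the evacuation can actually proceed.

```python
from dataclasses import dataclass

# Minimum node counts per (RAID type, FTT); assumed from the usual vSAN rules.
MIN_HOSTS = {
    ("RAID-1", 1): 3,
    ("RAID-1", 2): 5,
    ("RAID-5", 1): 4,
    ("RAID-6", 2): 6,
}


@dataclass
class Host:
    name: str            # hypothetical host name
    capacity_gb: float   # raw vSAN capacity contributed by the host
    used_gb: float       # raw vSAN capacity currently consumed on the host


def can_fully_evacuate(hosts, leaving, policies, headroom=0.25):
    """Return (ok, blocked_policies, fits) for evacuating one host.

    headroom is an assumed fraction of capacity to keep free afterwards.
    """
    remaining = [h for h in hosts if h.name != leaving]
    departing = next(h for h in hosts if h.name == leaving)

    # Policies that need more nodes than will be left in the cluster.
    blocked = [p for p in policies if MIN_HOSTS[p] > len(remaining)]

    # All data, including what is evacuated, must fit on the remaining nodes.
    total_capacity = sum(h.capacity_gb for h in remaining)
    total_used = sum(h.used_gb for h in remaining) + departing.used_gb
    fits = total_used <= total_capacity * (1 - headroom)

    return (not blocked and fits), blocked, fits


hosts = [Host(f"esx{i}", 10000, 4000) for i in range(1, 5)]
print(can_fully_evacuate(hosts, "esx4", [("RAID-1", 1)]))
print(can_fully_evacuate(hosts, "esx4", [("RAID-1", 1), ("RAID-5", 1)]))
```

In this made-up 4-node example the RAID1 FTT=1 case passes, while adding a RAID5 policy blocks the evacuation because only 3 nodes would remain.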

Or

Reduced Redundancy:

Ensure you take good backups of all data, put the host in Maintenance Mode with 'Ensure Accessibility' (VMs will have no protection from failure until this host is back) and then re-install ESXi on the host. This doesn't require emptying the drives.
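
If you prefer the host's shell over the vSphere Client, both options map onto `esxcli system maintenanceMode set` with a vSAN mode argument. The small Python helper below just builds that command line so the two choices are visible side by side. The function name and the short "full"/"ensure"/"none" keys are hypothetical, and the flag spelling is an assumption to confirm with `esxcli system maintenanceMode set --help` on your build before running anything.

```python
# Assumed values for esxcli's --vsanmode option; verify on your ESXi version.
VSAN_MODES = {
    "full": "evacuateAllData",              # Full Data Migration
    "ensure": "ensureObjectAccessibility",  # Ensure Accessibility
    "none": "noAction",                     # no data movement at all
}


def maintenance_mode_command(mode, enable=True):
    """Build the esxcli argument list for entering or leaving Maintenance Mode."""
    cmd = ["esxcli", "system", "maintenanceMode", "set",
           "--enable", "true" if enable else "false"]
    if enable:
        # The vSAN data handling mode only matters when entering Maintenance Mode.
        cmd += ["--vsanmode", VSAN_MODES[mode]]
    return cmd


# Print the command for each option rather than executing it.
print(" ".join(maintenance_mode_command("full")))
print(" ".join(maintenance_mode_command("ensure")))
```

The helper only prints the command; nothing is executed, so you can review it before pasting it into an SSH session on the host.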

Place a Member of vSAN Cluster in Maintenance Mode

Notes on re-installing boot medium:

https://kb.vmware.com/s/article/2059091

You don't necessarily need to remove it from vSphere inventory.

Bob
