VMware Cloud Community
Brainbugg
Contributor

How to replace a failed server in a vSAN cluster

Hi There

What steps must be followed to rebuild and replace a failed host in a 4-node vSAN cluster when ESXi needs to be reinstalled? Should the disks that are part of the vSAN cluster be wiped before the host rejoins the cluster?

Do the steps differ between vSAN 6.6 and 6.7 clusters?

I followed the steps in the article below, but they don't seem to work: when I add the server back, it doesn't join the existing vSAN cluster but instead creates a new cluster consisting of only one node.

https://nolabnoparty.com/en/replace-failed-host-vsan-cluster/

Thanks

Regards

1 Reply
TheBobkin
Champion

I can't see where that blog article says to add the reinstalled host back into the original vSphere cluster in the vCenter inventory. In 6.6 and later, vCenter is responsible for pushing down membership changes based on the vSphere cluster members. If you don't add the node to that cluster, then regardless of whether you try to join the vSAN cluster manually via the CLI, the node won't have the unicast list it needs to communicate with the other nodes, and the other nodes won't have a unicast entry for the 'new' node.
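
To make the membership point concrete, here is a rough Python sketch of the check you would be doing by hand when comparing unicast lists across hosts. Everything in it is made up for illustration: the host names, the UUID strings, and the assumption that the rebuilt host came back with a new node UUID and an empty list. It only models the rule that every node needs an entry for every other node, and it does not talk to ESXi or vCenter.

```python
# Hypothetical inventory: host name -> vSAN node UUID (placeholder values).
node_uuids = {
    "esx01": "uuid-1",
    "esx02": "uuid-2",
    "esx03": "uuid-3",
    "esx04": "uuid-4-new",  # the rebuilt host (assumed to have a new UUID)
}

# Hypothetical unicast lists as seen on each host: host -> set of peer UUIDs.
unicast_lists = {
    "esx01": {"uuid-2", "uuid-3"},
    "esx02": {"uuid-1", "uuid-3"},
    "esx03": {"uuid-1", "uuid-2"},
    "esx04": set(),  # nothing was pushed to the rebuilt host
}


def missing_unicast_entries(node_uuids, unicast_lists):
    """Return {host: sorted peer names whose UUID is absent from that host's list}."""
    missing = {}
    for host in node_uuids:
        known = unicast_lists.get(host, set())
        absent = sorted(
            peer
            for peer, peer_uuid in node_uuids.items()
            if peer != host and peer_uuid not in known
        )
        if absent:
            missing[host] = absent
    return missing


if __name__ == "__main__":
    for host, peers in missing_unicast_entries(node_uuids, unicast_lists).items():
        print(f"{host} is missing entries for: {', '.join(peers)}")
```

With this sample data, the three surviving hosts each lack an entry for esx04 and esx04 lacks all three peers, which is the situation that leaves the rebuilt host in its own single-node partition.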

Follow the steps here:

VMware Knowledge Base

If it is still partitioned, then either manually populate the unicast lists or click the 'vCenter is Authoritative' button in the Health UI.

VMware Knowledge Base
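
If you go the manual route, a small helper like the Python sketch below can print the commands to run on each host so no peer gets missed. Treat it as an illustration only: the node names, UUIDs and IPs are placeholders, and the `esxcli vsan cluster unicastagent add` flags and port 12321 are written as I recall them from the unicast KB, so check them with `esxcli vsan cluster unicastagent add --help` on your build before running anything.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VsanNode:
    name: str     # label for printing only (hypothetical)
    uuid: str     # vSAN node UUID, e.g. as reported by `esxcli vsan cluster get`
    vsan_ip: str  # IP of the vSAN-tagged vmkernel interface


# Assumed default vSAN unicast port; confirm for your environment.
VSAN_UNICAST_PORT = 12321


def unicast_add_commands(target, nodes):
    """Build the commands to run on `target`: one per peer, never for itself."""
    return [
        "esxcli vsan cluster unicastagent add "
        f"-t node -u {peer.uuid} -U true -a {peer.vsan_ip} -p {VSAN_UNICAST_PORT}"
        for peer in nodes
        if peer.uuid != target.uuid
    ]


if __name__ == "__main__":
    # Placeholder values, not taken from any real cluster.
    nodes = [
        VsanNode("esx01", "uuid-1", "192.0.2.11"),
        VsanNode("esx02", "uuid-2", "192.0.2.12"),
        VsanNode("esx03", "uuid-3", "192.0.2.13"),
        VsanNode("esx04", "uuid-4-new", "192.0.2.14"),
    ]
    for node in nodes:
        print(f"# run on {node.name}")
        for cmd in unicast_add_commands(node, nodes):
            print(cmd)
```

A host must never get an entry for its own UUID, which is why the helper filters the target out of its own list. Afterwards, `esxcli vsan cluster unicastagent list` on each host should show every other node.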

Edit: No, you don't need to delete the vSAN partitions from the disks, and it is preferable that you don't.

Bob
