VMware Cloud Community
Evan55
Contributor

Split and Move vSAN cluster

Hi - looking for some guidance on the following please:

6 node vSAN cluster, Dell R740xd

2 disk groups per node

NVMe cache and SSD capacity

ESXi 6.7 Update 1 fully patched

Dedicated redundant 10gig switches for vSAN and dedicated separate 10gig switches for VMs/Management

ESXi installed on dedicated Dell BOSS redundant cards

This was a brand new setup that I had one of my senior technical staff set up and deploy. He has now left the organization and I have found numerous issues with the setup, especially around the network config, and I have spent the last week fixing and redoing it. This is going to be our production environment, so it needs to be 100%.

Even after redoing the networking, fixing some other host issues, and upgrading firmware and drivers to the correct levels, I am plagued with random host connectivity issues, other issues that pop up in the Health Checks, and normal tasks that error out, such as putting a host into maintenance mode or enabling/disabling HA or DRS. These tasks often throw errors and I'm constantly chasing my tail trying to fix them. To its credit, the VMs running on the vSAN have worked flawlessly.

So I would like to start completely from scratch and do this entire cluster over again, properly.

Here is my plan. Bear in mind there are already about 35 VMs running on the cluster, and many more will be migrated from the old server environment. I have halted migration of any more until this is resolved. From a capacity perspective it won't be easy to move these 35 back again, so ideally I just need to fix this and move forward.

Tell me if you think any of this won't work or if I am forgetting something:

  • Put host 1 in maintenance mode > evacuate all data
  • Remove host from cluster
  • Repeat for Host 2 and 3
  • Create "vSANCluster2" (name is for example only)
  • Create new DvS switch for vSAN and one for VMs/Management
  • Remove hosts 1,2,3 from original DvS and add to new DvS
  • Add hosts 1,2,3 to vSANCluster2
  • Enable vSAN on vSANCluster2
  • Confirm all is good with no errors
  • vMotion VMs from vSANCluster 1 to vSANCluster 2

I am lucky I have 6 hosts, so I can run two 3-node clusters temporarily.

Some things that concern me:

  • Will the new vSANCluster2 accept the hosts without problems? I am worried that it will see existing vSAN partitions or similar and reject the host? Should I "format" the drives first and if so, how?
  • Do you envision any problems at the HBA level from running two vSAN clusters at the same time? I am running Dell HBA330
  • vCenter is running in Enhanced Linked Mode with another vCenter running at another site also with a 4 node vSAN cluster, if that makes any difference.

In theory this should work, right...?

EDIT: Just to clarify - the plan is to move all hosts into the new vSANCluster2 eventually, not to run two separate clusters. I just need to do it in phases to get to the goal of one new 6-node cluster.


Accepted Solutions
vpradeep01
VMware Employee

Hello Evan55! Good day.

I would suggest performing a few basic health checks before proceeding with the plan:

1. Consolidate the snapshots on VMs which are still running.

2. Make sure the current member count on the cluster is 6. Check with "esxcli vsan cluster get"; the member count should be 6.

3. Make sure all the drives on all the hosts are currently mounted and healthy in CMMDS. Check with "esxcli vsan storage list | grep -i cmmds"; "true" on every line indicates that all drives are mounted.

4. Make sure there are no resyncs pending in the cluster.

5. Confirm that every VM's compliance state is Compliant with its applied storage policy.
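Checks 2 and 3 above could be scripted by capturing the command output from each host and parsing it. The following is only a minimal sketch: it assumes the member count appears on a line labelled "Sub-Cluster Member Count" and that each disk reports a line of the form "In CMMDS: true". The sample text is illustrative and trimmed, not captured from a real host, and the function names are hypothetical.

```python
import re


def member_count(cluster_get_output: str) -> int:
    """Return the member count found in `esxcli vsan cluster get` output."""
    # Assumption: the field is labelled "Sub-Cluster Member Count".
    match = re.search(r"Sub-Cluster Member Count:\s*(\d+)", cluster_get_output)
    if match is None:
        raise ValueError("member count field not found")
    return int(match.group(1))


def disks_not_in_cmmds(storage_list_output: str) -> int:
    """Count disks whose 'In CMMDS' value is anything other than true."""
    values = re.findall(r"In CMMDS:\s*(\S+)", storage_list_output,
                        flags=re.IGNORECASE)
    if not values:
        raise ValueError("no 'In CMMDS' lines found")
    return sum(1 for value in values if value.lower() != "true")


# Illustrative, trimmed sample text (hypothetical, not real host output).
SAMPLE_CLUSTER = """\
   Enabled: true
   Sub-Cluster Member Count: 6
"""
SAMPLE_STORAGE = """\
   In CMMDS: true
   In CMMDS: true
   In CMMDS: false
"""

print(member_count(SAMPLE_CLUSTER))        # expected count for this thread is 6
print(disks_not_in_cmmds(SAMPLE_STORAGE))  # any non-zero value needs attention
```

A non-zero result from the second function would mean at least one drive is not mounted, which should be resolved before any host is evacuated.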

Action plan with a few additions:

  • Put host 1 in maintenance mode > evacuate all data
  • Remove the host from the vSAN cluster by running the command below (I assume this is what you meant, but I would like to mention it explicitly):

esxcli vsan cluster leave

  • Repeat for Host 2 and 3
  • Create "vSANCluster2" (name is for example only)
  • Create new DvS switch for vSAN and one for VMs/Management
  • Remove hosts 1,2,3 from original DvS and add to new DvS
  • Add hosts 1,2,3 to vSANCluster2
  • Enable vSAN on vSANCluster2
  • Confirm all is good with no errors
  • Storage vMotion VMs from vSANCluster 1 to vSANCluster 2
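One way to sanity-check the phasing in the steps above is to walk through the host counts as each host leaves the original cluster. This is only an illustrative sketch: it assumes every object uses RAID 1 with one failure to tolerate, which needs at least three hosts (two replicas plus a witness), and the function name is hypothetical.

```python
from typing import List, Tuple

# Assumption: RAID 1 with FTT=1 needs two replicas plus a witness, so 3 hosts.
MIN_HOSTS_FTT1 = 3


def simulate_phase_one(total_hosts: int, hosts_to_move: int) -> List[Tuple[int, int]]:
    """Return (hosts left in old cluster, hosts staged for new cluster) per move."""
    old, staged = total_hosts, 0
    states = []
    for _ in range(hosts_to_move):
        if old - 1 < MIN_HOSTS_FTT1:
            # The old cluster still holds the VMs during phase one, so it
            # must never fall below the minimum host count for the policy.
            raise ValueError("old cluster would drop below 3 hosts while holding data")
        old -= 1
        staged += 1
        states.append((old, staged))
    return states


print(simulate_phase_one(6, 3))  # [(5, 1), (4, 2), (3, 3)]
```

With 6 hosts, moving 3 leaves both sides at the three-host minimum, which is why the plan works for this cluster but leaves no spare host on either side while the VMs are being moved.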

I am lucky I have 6 hosts, so I can run two 3-node clusters temporarily.

Ans:

The above steps should work well, provided that all VMs/objects/VMDKs use RAID 1 and that there is enough space for the mirrors of the largest RAID 1 object/VMDK to fit on the hosts.
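To illustrate why RAID 1 matters here, the rough check below compares a storage policy against a 3-node cluster. It is a sketch under stated assumptions: the minimum host counts and raw-space multipliers are the usual vSAN 6.x figures for RAID 1 (FTT=1), RAID 5 and RAID 6; the 30% slack reserve, the VM sizes, and the per-host capacity are hypothetical numbers chosen only for the example. It checks aggregate capacity only and does not model per-host component placement.

```python
# Minimum hosts and raw-capacity multiplier per policy (vSAN 6.x figures).
MIN_HOSTS = {"RAID1": 3, "RAID5": 4, "RAID6": 6}
RAW_MULTIPLIER = {"RAID1": 2.0, "RAID5": 4 / 3, "RAID6": 1.5}


def fits(policy, hosts, vm_used_gb, raw_capacity_gb, slack=0.30):
    """Return True if the VMs fit the cluster under the given policy.

    slack is the fraction of raw capacity kept free (an assumed value).
    """
    if hosts < MIN_HOSTS[policy]:
        return False  # not enough hosts to place the components at all
    needed = sum(vm_used_gb) * RAW_MULTIPLIER[policy]
    usable = raw_capacity_gb * (1 - slack)
    return needed <= usable


# Hypothetical example: 35 VMs using 200 GB each, 3 hosts with 7680 GB raw each.
vms = [200] * 35
raw = 3 * 7680

print(fits("RAID1", 3, vms, raw))  # True: 14000 GB needed, about 16128 GB usable
print(fits("RAID5", 3, vms, raw))  # False: RAID 5 needs at least 4 hosts
```

Any object on a RAID 5 or RAID 6 policy would need its policy changed to RAID 1 before it could live on the temporary 3-node cluster.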

Some things that concern me:

1. Will the new vSANCluster2 accept the hosts without problems? I am worried that it will see existing vSAN partitions or similar and reject the host? Should I "format" the drives first and if so, how?

Ans:

There should NOT be any problem as long as the hosts have successfully left the original vSAN cluster, either by running "esxcli vsan cluster leave" as mentioned above after placing host 1 into maintenance mode with full data evacuation, or by destroying the disk groups on host 1 with full data evacuation.

2. Do you envision any problems at the HBA level from running two vSAN clusters at the same time? I am running Dell HBA330

Ans:

You should not face any issue with this HBA. HBA330 is a supported IO Controller for vSAN.

Just make sure the drives presented to this controller are in pass-through mode, as per the vSAN HCL.

3. vCenter is running in Enhanced Linked Mode with another vCenter running at another site also with a 4 node vSAN cluster, if that makes any difference.

Ans: This should be good.

In theory this should work, right...?

Yes, as long as all the objects are in RAID 1.

EDIT: Just to clarify - the plan is to move all hosts into the new vSANCluster2 eventually, not to run two separate clusters. I just need to do it in phases to get to the goal of one new 6-node cluster.

Your plan is drafted very well. It was easy to understand without having to ask you more questions.

Thanks

Evan55
Contributor

Thanks vpradeep01 - this is a very helpful and detailed reply, and it's much appreciated.

I am busy evacuating data from the three hosts now and then will continue as you suggested.

Thanks again.

vpradeep01
VMware Employee

You're welcome, Evan55.

rastickland
Enthusiast

This is something that I am going to have to do in the very near future. The only challenge for me is that my vSAN cluster is only 4 nodes and will be moving to a new vCenter altogether. My question is: can the same steps be applied when splitting the vSAN cluster into two groups of two nodes?
