VMware Cloud Community
saswatipatanaik
Contributor

VSAN crashed. Invalid VMs

I have 4 hosts (No. 2, No. 3, No. 4 and No. 5) and 2 vCenters (Mgmt and Tenant) under 1 PSC. vSAN is configured on all 4 hosts. I migrated 2 physical hosts (No. 4 and No. 5) to the management vCenter, after which I lost connection to DNS and the web client, which were running on vsandatacenter on Host No. 3. Host 3 and Host 2 are in the tenant vCenter.

Is there a way to bring at least the DNS up? Or better, could we add all hosts back to the vSAN cluster like before?

I followed the articles below, but they did not work:

https://kb.vmware.com/s/article/2059091 -- to add hosts 4 and 5 to the vSAN cluster

http://woshub.com/invalid-state-virtual-machine-vmware-esxi/ -- to reload the invalid VMs

Command output from host(3):

[root@esxi20-3:~] vim-cmd /vmsvc/getallvms

Skipping invalid VM '12'

Skipping invalid VM '13'

Skipping invalid VM '14'

Skipping invalid VM '5'

Vmid   Name   File   Guest OS   Version   Annotation

[root@esxi20-3:~] vim-cmd vmsvc/reload 12

[root@esxi20-3:~] vim-cmd vmsvc/reload 13

[root@esxi20-3:~] vim-cmd vmsvc/reload 14

[root@esxi20-3:~] vim-cmd vmsvc/reload 5

[root@esxi20-3:~]

[root@esxi20-3:~]

[root@esxi20-3:~] vim-cmd /vmsvc/getallvms

Skipping invalid VM '12'

Skipping invalid VM '13'

Skipping invalid VM '14'

Skipping invalid VM '5'

Accepted Solutions
TheBobkin
Champion

Hello saswatipatanaik​,

"Is there a way to bring at least the DNS up? Or better, could we add all hosts back to the vSAN cluster like before?"

Your VMs are inaccessible because they lost quorum, i.e. the majority of their components are unavailable because half of the cluster's storage was pulled out (components are distributed and stored on the local storage of each node).
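To illustrate the quorum point, here is a minimal Python sketch of the idea. It assumes a simplified model in which every component carries exactly one vote (real vSAN objects can assign votes differently), and the host names and component placement are hypothetical, not taken from this cluster.

```python
def is_accessible(component_hosts, available_hosts):
    """Return True if strictly more than half of an object's components
    sit on hosts that are still available (simplified: one vote each)."""
    available = set(available_hosts)
    reachable = sum(1 for host in component_hosts if host in available)
    return reachable * 2 > len(component_hosts)


# Hypothetical placement: two data replicas plus one witness on three hosts.
placement = ["host3", "host4", "host5"]

# All four hosts present: 3 of 3 components reachable.
print(is_accessible(placement, ["host2", "host3", "host4", "host5"]))  # True

# Hosts 4 and 5 removed: only 1 of 3 components reachable, quorum is lost.
print(is_accessible(placement, ["host2", "host3"]))  # False
```

Under this model, any object with two of its three components on the removed hosts becomes inaccessible until those hosts rejoin.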

From your other post I am assuming that you are running vSAN 6.6/6.7. If so, just adding the hosts back to the cluster using 'esxcli vsan cluster join -u <ClusterUUID>' will likely not be sufficient, as the unicast entries on each node probably had changes pushed down when you removed the nodes from the original cluster.

Check the unicastagent list entries on all nodes and manually add entries wherever a node does not contain entries for all other nodes (DO NOT add a node's own entry to its table, as this can cause PSODs):

VMware Knowledge Base
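The comparison described above can be sketched in Python as follows. This is only an illustration: it assumes you have copied the node UUIDs by hand from the output of 'esxcli vsan cluster unicastagent list' on each node, and the UUID strings and function names here are placeholders I made up.

```python
def missing_unicast_entries(cluster_nodes, unicast_tables):
    """Map each node to the sorted peer UUIDs absent from its unicast table.
    A node's own UUID is never expected, so it is never reported as missing."""
    missing = {}
    for node in cluster_nodes:
        expected = set(cluster_nodes) - {node}
        present = set(unicast_tables.get(node, ()))
        gaps = sorted(expected - present)
        if gaps:
            missing[node] = gaps
    return missing


def nodes_listing_themselves(unicast_tables):
    """Return nodes whose table wrongly contains their own UUID."""
    return sorted(n for n, peers in unicast_tables.items() if n in peers)


# Placeholder UUIDs for the four hosts in this thread.
nodes = ["uuid-2", "uuid-3", "uuid-4", "uuid-5"]
tables = {
    "uuid-2": ["uuid-3"],                      # lost entries for hosts 4 and 5
    "uuid-3": ["uuid-2"],                      # lost entries for hosts 4 and 5
    "uuid-4": ["uuid-2", "uuid-3", "uuid-5"],  # complete
    "uuid-5": ["uuid-4"],                      # missing hosts 2 and 3
}

print(missing_unicast_entries(nodes, tables))
print(nodes_listing_themselves(tables))
```

Each UUID reported for a node is an entry that would need to be added on that node; anything returned by the second function should be removed rather than added.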

Bob

saswatipatanaik
Contributor

Thanks Bob, that worked!

TheBobkin
Champion

Hello saswatipatanaik​,

Happy to hear (but not surprised) that fixed it.

Just for future reference: always exercise caution when moving vSAN nodes from one vCenter to another. vCenter uses the members of the vSphere inventory cluster object to determine vSAN-level cluster membership, so all nodes in a vSAN cluster need to be in the same vSphere-level cluster. If this ever needs to change (e.g. migrating the whole cluster to a new vCenter or a new vSphere cluster object), follow the necessary steps to ensure the cluster does not get partitioned:

VMware Knowledge Base
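As a quick sanity check before and after such a move, the membership rule can be expressed as a small Python sketch. The inventory mapping below is typed in by hand for illustration; the vCenter and cluster names are hypothetical, and nothing here queries vCenter itself.

```python
def group_by_location(host_locations):
    """Group hosts by their (vcenter, vsphere_cluster) location."""
    groups = {}
    for host, location in host_locations.items():
        groups.setdefault(location, []).append(host)
    return {location: sorted(hosts) for location, hosts in groups.items()}


def membership_is_consistent(host_locations):
    """True when every vSAN node sits in the same vCenter and cluster object."""
    return len(set(host_locations.values())) <= 1


# The situation from this thread, with made-up cluster names.
inventory = {
    "host2": ("tenant-vc", "vsan-cluster"),
    "host3": ("tenant-vc", "vsan-cluster"),
    "host4": ("mgmt-vc", "mgmt-cluster"),
    "host5": ("mgmt-vc", "mgmt-cluster"),
}

print(membership_is_consistent(inventory))  # False
print(group_by_location(inventory))
```

More than one group in the result means the vSAN nodes are split across inventory objects, which is the condition that led to the partition here.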

Bob