VMware Cloud Community
bastaramus

Unhealthy disk(s) are used by vSAN. But this disk and host are absent

I have a strange issue with my VSAN.

There are 3 servers in the cluster. Two of them have disk format version 7 and the third has version 5. When I tried to upgrade the disk format version, I got the following error:

Unhealthy disk(s) 522e6640-06e2-6e3f-ad08-af84f26abf2e are used by vSAN.

The most interesting part is that the owner of this UUID no longer exists in my vSAN cluster.


I can see this object via the "esxcli vsan debug object list" command:

UUID: 522e6640-06e2-6e3f-ad08-af84f26abf2e
   Name: naa.57c3548172844c9f
   Owner: mi204.local
   Version: 7
   Disk Group: 522e6640-06e2-6e3f-ad08-af84f26abf2e
   Disk Tier: Cache
   SSD: false
   In Cmmds: true
   In Vsi: false
   Model: N/A
   Encryption: false
   Deduplication: false
   Dedup Ratio: N/A
   Overall Health: (PDL)
   Metadata Health: N/A
   Operational Health: N/A
   Congestion Health:
         State: N/A
         Congestion Value: 0
         Congestion Area: N/A
         All Congestion Fields:
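
The two fields that stand out in this output are "In Cmmds: true" and "In Vsi: false". A minimal Python sketch of how such "Key: value" output could be scanned for entries in that state follows. It assumes the layout shown above, and it assumes (this is my reading, not something stated in the thread) that this combination means the cluster directory still lists a disk that no host currently presents. The function names are hypothetical, and the script only reads text; it changes nothing.

```python
def parse_disk_entries(text):
    """Split 'Key: value' output into one dict per entry.

    Assumption: every entry starts with a 'UUID:' line, as in the output above.
    """
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "UUID":
            current = {}
            entries.append(current)
        if current is not None:
            current[key] = value
    return entries


def stale_cmmds_entries(entries):
    """Return entries listed in CMMDS but not visible in VSI (assumed to be stale)."""
    return [
        e for e in entries
        if e.get("In Cmmds") == "true" and e.get("In Vsi") == "false"
    ]


# Sample taken from the output in this post (shortened).
SAMPLE = """\
UUID: 522e6640-06e2-6e3f-ad08-af84f26abf2e
   Name: naa.57c3548172844c9f
   Owner: mi204.local
   Disk Tier: Cache
   In Cmmds: true
   In Vsi: false
   Overall Health: (PDL)
"""

if __name__ == "__main__":
    for entry in stale_cmmds_entries(parse_disk_entries(SAMPLE)):
        print(entry["UUID"], entry["Owner"], entry["Overall Health"])
```

A hit from this filter is only a hint about where to look next, not proof that an entry is safe to remove.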

Also, I found this object and its owner via the RVC command "vsan.cmmds_find -u 522e6640-06e2-6e3f-ad08-af84f26abf2e". The status is unhealthy:
0a112-fbd8ceba-badb-48dc-a0af-f2fba9d8121c.png

But the owner of this object is gone. I removed the server with this UUID 3 weeks ago, so I can't log in to it via SSH and remove the inaccessible object with objtool.

The owner also shows an unhealthy status:

0a112-732d4fc2-1dc0-44c3-b55e-52792bf928cc.png

I tried to remove it from another server, but got a message that the object was not found:

grabilla.Uh9740.png

Do you have any idea how to remove the unhealthy, inaccessible object and/or host, and how to resolve this issue?

Thank you very much!

Btw, I have restarted the vCenter Server many times. The issue still exists.


Accepted Solutions
TheBobkin

Hello bastaramus​,

These are CMMDS entries, not DOM objects, so it is completely expected that objtool cannot remove them.

They can be removed using the cmmds-tool delete option, but I must emphasise that extreme caution is needed when doing such things. For example, don't delete entries for any host, or any entry belonging to a host, that has any possibility of coming back. And don't even consider using this to remove entries relating to DOM or LSOM objects: doing so won't remove the data, it only removes the CMMDS entries describing them.

Bob

bastaramus

Thank you TheBobkin​!

I used cmmds-tool to remove this object (it was a disk group from an old, unavailable server) and the error is gone.

There are also some CMMDS entries with an unhealthy status, for example hosts that I removed previously. Those hosts will never come back, and they are not present in the cluster or in the vCenter infrastructure at all. TheBobkin​, what do you think: should I remove them via cmmds-tool or leave them as they are?

TheBobkin

Happy to help.

I must repeat, though, for anyone who happens across this post in future: please don't go messing with cmmds-tool unless you know what you are doing and the problem is clear. If in doubt, please just call us at GSS.

Unhealthy references to old nodes are fairly common, for example as a result of nodes being re-imaged after a boot device failure. You can remove them if you like, but the only time I have seen them cause any kind of issue is when they relate to Witnesses (which have extra entry types such as PREFERRED_FAULT_DOMAIN). I have yet to encounter any negative impact from leftover normal NODE references.
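
To make the distinction above concrete, here is a small, read-only Python sketch that sorts unhealthy CMMDS entries into "leftover NODE references" and "needs a closer look". It assumes the entries have been exported as JSON with an "entries" list whose items carry "uuid", "owner", "type" and "health" fields. That shape is an assumption modelled on JSON-formatted cmmds-tool find output and should be checked against your own build. The function name and the sample UUIDs are hypothetical, and the script never deletes anything.

```python
import json


def triage_entries(export_text, live_owner_uuids):
    """Sort unhealthy entries owned by hosts that are no longer in the cluster.

    Returns (leftovers, review):
      leftovers - plain NODE entries, the kind described above as harmless
      review    - any other type (e.g. PREFERRED_FAULT_DOMAIN), to check by hand
    Entries owned by a live host are skipped, since that host may come back.
    """
    data = json.loads(export_text)
    leftovers, review = [], []
    for entry in data.get("entries", []):
        if entry.get("health") == "Healthy":
            continue
        if entry.get("owner") in live_owner_uuids:
            continue
        if entry.get("type") == "NODE":
            leftovers.append(entry)
        else:
            review.append(entry)
    return leftovers, review


# Hypothetical export; the UUIDs are placeholders, not real values.
SAMPLE_EXPORT = json.dumps({
    "entries": [
        {"uuid": "u1", "owner": "removed-host-1", "type": "NODE", "health": "Unhealthy"},
        {"uuid": "u2", "owner": "removed-witness", "type": "PREFERRED_FAULT_DOMAIN", "health": "Unhealthy"},
        {"uuid": "u3", "owner": "live-host-1", "type": "NODE", "health": "Unhealthy"},
        {"uuid": "u4", "owner": "live-host-2", "type": "NODE", "health": "Healthy"},
    ]
})

if __name__ == "__main__":
    leftovers, review = triage_entries(SAMPLE_EXPORT, {"live-host-1", "live-host-2"})
    print("leftover NODE entries:", [e["uuid"] for e in leftovers])
    print("check by hand:", [e["uuid"] for e in review])
```

The list of live owner UUIDs has to come from the hosts that are actually in the cluster today. Anything in the "check by hand" list is a reason to stop and ask support rather than a deletion candidate.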

Bob
