I'm getting "Component metadata health = invalid state" warnings for three objects in my VSAN cluster.
When running "vsan.cmmds_find" in RVC, the data for these objects shows up blank:
/x.x.x.x/datacenter/computers> vsan.cmmds_find 0 -u e17bcc57-50b8-1a27-e0a6-288023b04018
+---+------+------+-------+--------+---------+
| # | Type | UUID | Owner | Health | Content |
+---+------+------+-------+--------+---------+
+---+------+------+-------+--------+---------+
/x.x.x.x/datacenter/computers> vsan.cmmds_find 0 -u b708cf57-4870-11d0-0196-288023b046b0
+---+------+------+-------+--------+---------+
| # | Type | UUID | Owner | Health | Content |
+---+------+------+-------+--------+---------+
+---+------+------+-------+--------+---------+
/x.x.x.x/datacenter/computers> vsan.cmmds_find 0 -u 1348cb57-5f3b-2108-a2d4-288023b047f4
+---+------+------+-------+--------+---------+
| # | Type | UUID | Owner | Health | Content |
+---+------+------+-------+--------+---------+
+---+------+------+-------+--------+---------+
I have a ticket open with support and was told that these objects are no longer present in VSAN. The recommended solution was to delete all disk groups on the two affected hosts using the "Full data migration" option, then re-create the disk groups. This solution seems quite heavy-handed to me and would result in significant disk activity during data migration as well as during re-balancing once these hosts are again contributing full disk capacity.
Is there any way of determining the specific disk group that these non-present objects were a member of? I'd prefer to delete/recreate individual disk groups if possible, rather than all disk groups on these two hosts.
Thanks,
Matt
Good morning, this may be a known bug: "VSAN health check - component metadata health". Thank you, Zach.
KB article for it here: https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=21453...
More than likely you will have to evacuate/destroy/recreate the affected disk group(s). I ran into this issue and also had a support case open; the end result was recreating 6 disk groups with > 100TB of data being moved, which took about a week.
I'm certainly prepared to destroy/recreate the disk groups, but I would prefer not to destroy/recreate ALL of the disk groups on the two affected hosts.
My problem is that the object UUIDs listed in the health check are no longer present in VSAN, so "vsan.cmmds_find" comes up empty.
Hoping someone can suggest a method of finding which disk groups contained these missing objects. Maybe VSAN logs I can dig through that would reference these objects and their respective disk groups?
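For the log-digging idea, here is a rough Python sketch of what I have in mind. It only does a plain text search and makes several assumptions: that the relevant logs have been copied off the hosts into a local directory (the "./vsan-logs" path is hypothetical), that they are plain-text files ending in ".log", and that a log line mentioning one of the missing objects might also carry the UUID of its disk or disk group. I don't know that the logs actually record this, so any other UUIDs it prints are only candidates to check by hand.

```python
import re
from pathlib import Path

# The three component UUIDs reported by the health check (from this thread).
MISSING_UUIDS = [
    "e17bcc57-50b8-1a27-e0a6-288023b04018",
    "b708cf57-4870-11d0-0196-288023b046b0",
    "1348cb57-5f3b-2108-a2d4-288023b047f4",
]

# Matches any string in the same 8-4-4-4-12 shape as the UUIDs above.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def find_uuid_mentions(log_dir, uuids):
    """Return {uuid: [(file, line_no, line_text), ...]} for every *.log under log_dir."""
    wanted = [u.lower() for u in uuids]
    hits = {u: [] for u in wanted}
    for path in sorted(Path(log_dir).rglob("*.log")):
        with open(path, errors="replace") as fh:
            for line_no, line in enumerate(fh, 1):
                lowered = line.lower()
                for uuid in wanted:
                    if uuid in lowered:
                        hits[uuid].append((str(path), line_no, line.rstrip("\n")))
    return hits


def co_occurring_uuids(line, target):
    """List other UUID-shaped strings on the same line (candidates only)."""
    target = target.lower()
    return [u for u in UUID_RE.findall(line.lower()) if u != target]


if __name__ == "__main__":
    # Hypothetical location of logs copied from the two affected hosts.
    results = find_uuid_mentions("./vsan-logs", MISSING_UUIDS)
    for uuid, mentions in results.items():
        print(f"{uuid}: {len(mentions)} mention(s)")
        for file_name, line_no, text in mentions:
            print(f"  {file_name}:{line_no}: {text}")
            for other in co_occurring_uuids(text, uuid):
                print(f"    other UUID on this line: {other}")
```

If this turns up nothing, it only means the copied logs don't mention those UUIDs; it would not prove which disk groups are unaffected.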
I had the same issue (where cmmds_find and objtool would not find the UUID/object); the resolution was to recreate the disk groups. My case was a few months ago, so I guess you can wait and see if they have any other resolution available now.