VMware Cloud Community
JeremeyWise
Enthusiast

Force Delete Partition

 

I posted this a year or so ago, but after digging around I see nothing has changed.

 

Without rebooting, that is... a reboot is NOT a fix 😛 and is a last resort for me.

 

How do you force removal of a partition? I am trying to remediate orphaned vSAN blocks on hosts and disks which are working, but which somewhere along the life of the cluster picked up some "objects" that are in distress. I took a node and put it into maintenance mode with full data migration, so there is no data on vSAN. I now simply want to remove the partitions from the system so I can re-consume the disks as new vSAN disk objects.

My workaround hack is to boot a live Linux USB system, run "wipefs" on the disk, and then reboot ESXi. But that really is a hack.

 

Is there any means to unlock the disk (something like lsof) where the I/O lock on the disk can be detected, released, and the disk partition removed?
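For reference, the partition table can usually be inspected and cleared from the ESXi host shell without a Linux live-boot, as long as vSAN no longer claims the disk. A minimal sketch, with a placeholder device name (naa.xxxx) - not a guaranteed recipe:

```shell
# List devices and show the current partition layout
# (naa.xxxx is a placeholder - substitute your actual device)
ls /vmfs/devices/disks/
partedUtil getptbl /vmfs/devices/disks/naa.xxxx

# Check whether vSAN still claims the disk; a claimed disk is
# locked and its partitions cannot be deleted until it is removed
esxcli vsan storage list

# Delete the vSAN partitions (typically two), then write a clean GPT label
partedUtil delete /vmfs/devices/disks/naa.xxxx 1
partedUtil delete /vmfs/devices/disks/naa.xxxx 2
partedUtil mklabel /vmfs/devices/disks/naa.xxxx gpt
```

If partedUtil reports the device is busy or read-only, something (usually vSAN itself) still holds the lock, which matches the behavior described above.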

 

 


Nerd needing coffee
Reply
0 Kudos
4 Replies
JeremeyWise
Enthusiast

 

 

More updates. Trying to get vSAN fully healthy and upgraded.

Steps (after upgrading ESXi and vCenter to version 8):

1) Put each server in maintenance mode with data migration (if allowed); for most hosts this failed, so I did it without migration.

 

2) Reboot the host (this seems to release whatever locks or stuck services are holding the disk).

 

3) Remove the disk.

 

4) Remove the partition (this is needed; whenever you just add the disk back, even when it is listed as available, it just goes back into an error state).

 

5) Add the disk back into the disk group. For one host I had to create a new disk group and move the disk over to it before the steps above, but that assumes you have cache/capacity disks available to do that moving around.
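For anyone following along, the steps above map roughly onto these host-shell commands. This is a hedged sketch with placeholder UUIDs and device names, not the exact commands run on this cluster:

```shell
# 1) Enter maintenance mode; fall back to no migration if evacuation fails
esxcli system maintenanceMode set --enable true --vsanmode evacuateAllData \
  || esxcli system maintenanceMode set --enable true --vsanmode noAction

# 2) Reboot the host (done out of band) to release stuck locks/services

# 3) Remove the disk from its vSAN disk group (UUID below is a placeholder;
#    get the real one from 'esxcli vsan storage list')
esxcli vsan storage list
esxcli vsan storage remove --uuid 52xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx

# 4) Remove the leftover partitions (device name is a placeholder)
partedUtil mklabel /vmfs/devices/disks/naa.xxxx gpt

# 5) Add the disk back into a disk group (cache + capacity placeholders)
esxcli vsan storage add --ssd naa.cachexxxx --disks naa.capacityxxxx
```

Step 4 is the one that matters here: re-adding a disk whose old vSAN partitions are intact is what seems to push it straight back into the error state.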

 

 

 

Now all disks on all servers show healthy, and the single disk group per fault domain on each server is listed as "healthy".

 

JeremeyWise_0-1675010184814.png

 

But this leaves me with one last issue. I still cannot clear out what I see as "ghost objects".

JeremeyWise_1-1675010238668.png

 

 

And that, I think, is what is stopping the on-disk format upgrade from letting vSAN report fully healthy.

JeremeyWise_2-1675010312443.png

 

So I am still in the same situation: I cannot clear out these ghost objects. The last two times I got to this stage, the only fix was to delete the entire vSAN structure and rebuild from scratch, but there needs to be a means to remediate these lost objects. If you track down their placement, the disks they are on are "healthy", so I am not sure what the step is to fix the issue.

Nerd needing coffee
Reply
0 Kudos
TheBobkin
Champion

@JeremeyWise It is unclear whether you are asking about deleting vSAN partitions off disks or deleting objects (which are likely the result of deleting disks with the host in MM with the 'No Action' option, and/or disks dragged over from a previous cluster).

 

vSAN disk partitions don't need a Linux live-boot to wipe - they can be wiped via the vSphere Disk Management UI, the disks can be erased with the 'Erase partitions' utility in the vSphere/Host client, or as a last measure the first 50MB can be zeroed with dd followed by partedUtil mklabel.
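The dd + partedUtil route mentioned above looks roughly like this. Sketch only - the device name is a placeholder, and zeroing the wrong device is unrecoverable, so triple-check it first:

```shell
# Zero the first 50 MB to destroy the vSAN partition/metadata headers
# (naa.xxxx is a placeholder - verify the device before running this!)
dd if=/dev/zero of=/vmfs/devices/disks/naa.xxxx bs=1M count=50 conv=notrunc

# Write a fresh GPT label so the disk presents as blank
partedUtil mklabel /vmfs/devices/disks/naa.xxxx gpt
```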

 

With regard to inaccessible objects - these are probably still being clung to by vSAN due to having remaining components on disk. If you are positive you don't need any of them (e.g. if this is a test lab and no important VM is non-functional/missing) then they can be permanently deleted using objtool, e.g.:

/usr/lib/vmware/osfs/bin/objtool delete -u <UUID> -f -v10
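To find the UUIDs to feed to objtool, the inaccessible objects can usually be enumerated first. A sketch, assuming a reasonably recent ESXi build where the vsan debug namespace is available:

```shell
# Summarise object health cluster-wide; inaccessible objects show up here
esxcli vsan debug object health summary get

# List objects with their component placement and grep for the broken ones
esxcli vsan debug object list --all | grep -i -B5 inaccessible

# Then, for each UUID you are certain you can discard:
/usr/lib/vmware/osfs/bin/objtool delete -u <UUID> -f -v10
```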

 

Reply
0 Kudos
JeremeyWise
Enthusiast

I should have split the post, as the first issue was about a volume whose partition could not be cleared. It is more about the disk having a lock of some ilk that "a reboot fixes", which is very undesirable.

But the real end goal is a clean, upgraded vSAN. And that led down two other rat holes:

 

1) Orphaned objects

2) Upgrade disks to the latest on-disk format version (failing due to the object issues).

 

Is there a means, after deleting a single object, to validate replica state, or to run some form of check/validation?

The idea behind replication across disk groups / server pools is N+X, where X in my design is +1.

Example of the first object in the list with an issue:

JeremeyWise_0-1675048451900.png

 

d48fd262-e480-2ab0-5434-a0423f35e8ee

The object is on one server that is live and healthy, and the physical disk is present and healthy.

JeremeyWise_1-1675048514163.png

 

 

But the replica is on "Unknown disks", without any note of node or target.

What I would prefer is for it to move these chunklets/objects into some kind of "garbage mode" from which they could be retrieved.

Or to back-trace the "Unknown disks" to where the cleanup could be done with better confidence of which VM it would affect (though I realize this is at a lower level than a VM).

My concern is that I delete it, and I have no means to run a check that re-validates that current VMs still have their supporting objects within replication - so the deletion unknowingly kicks a leg out from under a VM and I won't know.

Maybe run a scan job, or Storage vMotion the VMs - some means to "validate" that each VM is back within N+X replication.
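As a sanity check after each objtool delete, the object health and resync state can be re-queried from the host shell. A sketch, assuming a recent ESXi build (the UUID is the one quoted above):

```shell
# Re-check overall object health - counts of healthy vs reduced-availability
esxcli vsan debug object health summary get

# Inspect a specific object's component state
esxcli vsan debug object list --uuid d48fd262-e480-2ab0-5434-a0423f35e8ee

# Check resync status to confirm any repairs are actually progressing
esxcli vsan debug resync summary get
```

If the summary shows no objects with reduced availability and no pending resync, that is about as close to a "VMs are back within N+X" validation as the CLI offers.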

 


Nerd needing coffee
Reply
0 Kudos
peetz
Leadership

You can only delete/clear partitions of disks consumed by vSAN when the host is no longer part of the vSAN cluster.

Log in via ssh to the host and run

esxcli vsan cluster leave

After that you will be able to clear the partitions using the vSphere client GUI.
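A minimal sketch of that sequence, including a check that the host has actually left the cluster before touching partitions (device name is a placeholder):

```shell
# Leave the vSAN cluster (the host should be in maintenance mode first)
esxcli vsan cluster leave

# Confirm membership - this should now report that vSAN clustering
# is not enabled on this host
esxcli vsan cluster get

# Partitions can then be cleared from the vSphere client GUI,
# or directly from the shell:
partedUtil mklabel /vmfs/devices/disks/naa.xxxx gpt
```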

- Andreas

Twitter: @VFrontDe, @ESXiPatches | https://esxi-patches.v-front.de | https://vibsdepot.v-front.de
Reply
0 Kudos