VMware Cloud Community
SayNo2HyperV
Enthusiast

vSAN GUI Object MGT?

Just starting to learn vSAN, beginning with a fresh vSAN 6.2 install. I'm learning on a nested cluster running on a single PE T610 backed by local disks. Just playing with two nodes + witness. The nested vSAN hosts have disk groups backed by separate RAID1s on the T610: a fake SSD vdisk + four 15 GB vdisks. Two vSAN vmks/subnets.


I've been playing with it for a few days and everything seemed to be working, though with very poor performance. It is all meant for learning, so I wasn't concerned. This morning I was playing with the performance stress test and it failed, and did some damage: the vSAN health check then reported 32 inaccessible objects.


So I began learning the procedure for vSAN object management. Nightmare. After several hours learning objtool + RVC, I am amazed that there is no GUI for object management and no tools to purge objects in bulk. I only found a mass purge for vswap objects.


For fun I ran the stress test again... the inaccessible objects doubled... and vSAN overhead on capacity keeps growing. I'm sure the nested environment is causing this issue, but it really has me wondering about management of vSAN in a large environment.


It cannot be that VMware wants each UUID manually entered to delete these objects. Jumping between objtool on the ESXi host console and the Ruby console? All CLI? No centralized GUI?


Does VMware intend to put more attention on vSAN object management? I hope the reason it's not here is that it's being developed for the HTML5 web client.


P.S. If you know why the stress test is failing and leaving all these inaccessible objects, please share.


Thanks.  TTFN.

1 Reply
elerium
Hot Shot

Managing and diagnosing inaccessible objects is somewhat of a pain if you ever run into this issue, and the GUI doesn't help much. I've seen it in production environments, usually related to upgrading vSAN disk formats, and in dev environments where vSAN runs on unsupported hardware whose disks generally can't keep up.

My best guess is that your stress test is failing because the underlying disk hardware in your test lab can't keep up (fake SSDs and/or nesting will do this). During a stress test, if no I/O is delivered for some time, it counts as a failure. This happens when disk latency climbs too high, or when disk commands are issued but the disk subsystem is too slow to respond for prolonged periods. If this happens, the VMDK files created by the stress test may not be removed. You can go to the datastore and just delete the stress test folders/VMDKs.
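
To make the "no I/O for some time" idea concrete, here is a rough Python sketch of that kind of check. It is not how the vSAN stress test is actually implemented; the function name, the sampling interval, and the 30-second threshold are all made up for illustration.

```python
def has_io_stall(iops_samples, interval_s, max_stall_s):
    """Return True if consecutive zero-I/O samples span at least max_stall_s seconds.

    iops_samples: I/O counts, one per sampling interval (hypothetical input).
    interval_s:   seconds covered by each sample (assumed value).
    max_stall_s:  how long a stall is tolerated (assumed, not a vSAN setting).
    """
    stalled_s = 0.0
    for iops in iops_samples:
        if iops == 0:
            # No I/O completed in this interval: extend the current stall.
            stalled_s += interval_s
            if stalled_s >= max_stall_s:
                return True
        else:
            # Any completed I/O resets the stall timer.
            stalled_s = 0.0
    return False


# Three 10-second samples in a row with no I/O reach a 30-second threshold.
print(has_io_stall([100, 0, 0, 0, 50], interval_s=10, max_stall_s=30))
```

The point is just that slow disks do not have to return errors to fail the test; going quiet for long enough is sufficient.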

If what I mentioned above doesn't remove your inaccessible objects, then yes, you'll have to use a combination of objtool and the Ruby CLI (RVC) to correct things.
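
If you end up there, you can at least avoid typing each UUID by hand. Below is a small Python sketch that pulls UUIDs out of pasted tool output and renders one command line per UUID. It assumes the object UUIDs appear in the usual 8-4-4-4-12 hex form, and it deliberately takes the command as a template you supply, since I'm not going to guess the exact objtool flags for your build. The function names and the `echo` template are placeholders, not real vSAN tooling.

```python
import re

# Assumption: object UUIDs appear in 8-4-4-4-12 hexadecimal form.
UUID_RE = re.compile(
    r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)


def extract_uuids(text):
    """Return unique UUIDs found in text, lowercased, in first-seen order."""
    seen = []
    for match in UUID_RE.findall(text):
        uuid = match.lower()
        if uuid not in seen:
            seen.append(uuid)
    return seen


def render_commands(uuids, template):
    """Fill a user-supplied template containing '{uuid}' once per UUID."""
    if "{uuid}" not in template:
        raise ValueError("template must contain the '{uuid}' placeholder")
    return [template.replace("{uuid}", uuid) for uuid in uuids]


# Hypothetical pasted output; the 'echo' template is a harmless placeholder.
pasted = """
Inaccessible: 0a1b2c3d-1111-2222-3333-444455556666
Inaccessible: 0A1B2C3D-1111-2222-3333-444455556666
Inaccessible: ffffffff-aaaa-bbbb-cccc-000011112222
"""
for line in render_commands(extract_uuids(pasted), "echo would-delete {uuid}"):
    print(line)
```

Review the generated lines before running anything for real, and only substitute a delete command once you have confirmed the objects are genuinely orphaned.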
