Zaib_Khan
Enthusiast

vSAN datastore is not reclaiming space after deleting a VM from disk

Hi All,

I am running a VxRail vSAN solution in my organization, and I am currently facing a frustrating issue with the free space on my vSAN datastore.

My current free space is 5 TB, but when I delete virtual machines from disk, the free space remains the same.

Recently I deleted a VM comprising 2 TB of space from vCenter (Delete from Disk). I expected the vSAN datastore free space to increase from 5 TB to 7 TB, but it remained at 5 TB.

However, when I create a new VM in vCenter, the vSAN free space does decrease, dropping below 5 TB.

My question is: why does the vSAN datastore free space not increase after a virtual machine is deleted?

7 Replies
TheBobkin
VMware Employee

Hello Zaib_Khan

"Recently I deleted a VM comprising 2 TB of space from vCenter (Delete from Disk). I expected the vSAN datastore free space to increase from 5 TB to 7 TB, but it remained at 5 TB."

Was this VM actually using 2TB (or any) space though?

This can be determined by observing how much space it is consuming on disk (VM > Edit Settings > Hard Disk > Drop-down info) - e.g. a Thin-provisioned 2TB vmdk added to a VM will take up near zero GB until data has been placed/generated on it.

You can get a much better idea of how much space is being utilised by each VM Object using RVC:

> vsan.vm_object_info localhost/DataCenterName/computers/ClusterName/resourcePool/vms/*

(This will print out the info for ALL VMs, so ensure the SSH session's scrollback buffer has enough lines to hold it all)

Individual VMs can be looked at one at a time using:

> vsan.vm_object_info localhost/DataCenterName/computers/ClusterName/resourcePool/vms/NameOfVM/

(Note: if you are using Resource Pools you created yourself rather than the default root pool, change the paths accordingly)

You can also pull all this (and more) information using this very nice tool that is available via CLI:

# python /usr/lib/vmware/vsan/bin/vsan-health-status.pyc

Where are you checking free space from?

Does 'df -h' via CLI show the same output as in the Web Client?

Try comparing the output of vsan.disks_stats <path_to_cluster> in RVC before and after deleting a VM.
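One rough way to make that before/after comparison repeatable is to record the free space programmatically. The following is only a sketch: it assumes a Python 3 interpreter is available somewhere the datastore is visible as a mounted path, and the mount point shown in the comments is a hypothetical placeholder, not a path taken from this thread.

```python
import shutil

TIB = 2 ** 40  # bytes per TiB


def free_tib(path):
    """Return the free space the OS reports for `path`, in TiB."""
    return shutil.disk_usage(path).free / TIB


def reclaimed_tib(free_before, free_after):
    """Difference between two free-space readings, in TiB.

    A positive value means space came back after the deletion.
    """
    return free_after - free_before


# Usage (hypothetical mount point, adjust to your environment):
#   before = free_tib("/vmfs/volumes/vsanDatastore")
#   ... delete the VM from disk and give the cluster time to clean up ...
#   after = free_tib("/vmfs/volumes/vsanDatastore")
#   print(f"reclaimed: {reclaimed_tib(before, after):.2f} TiB")
```

If the two readings are identical well after the deletion, that points toward objects still existing on the datastore rather than a display problem in the Web Client.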

You can even identify exactly which capacity device the components you are deleting are located on, using the tools above and/or the 'Virtual Objects' tab in the Web Client.

I have colleagues in VxRail team that can assist you with looking into this further if need be.

Bob

Zaib_Khan
Enthusiast

Hi TheBobkin,

Thanks for the update and support.

I have explored this issue further and found that "Number of failures to tolerate" is set to 1 in the default storage policy, which keeps redundant copies of each VM's data on the vSAN datastore.

So I created a test storage policy with "Number of failures to tolerate" = 0 and moved some VMs to it. As a result, the free space on my vSAN datastore has increased by 3 TB.

Kindly advise: am I doing the right thing, or should I follow other steps recommended by VMware?

cyberpaul
Enthusiast

Hi,

decreasing "Number of failures to tolerate" is not a good idea. Yes, you've bought yourself some space, but you might lose your data if one of your devices fails. Please take a look at this for more detailed info:

About Virtual SAN Policies
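To put rough numbers on that trade-off: with the default mirroring (RAID-1) layout, vSAN keeps FTT + 1 full copies of each object, so lowering FTT from 1 to 0 roughly halves the raw footprint while removing all redundancy. The sketch below is simple arithmetic under that assumption. It ignores witness components, deduplication, and erasure-coding policies, and the function names and the 3 TB figure are illustrative only.

```python
def raw_footprint_tb(used_tb, ftt):
    """Approximate raw capacity consumed under RAID-1 mirroring.

    Assumption: FTT + 1 full replicas, no witness or metadata overhead.
    """
    if ftt < 0:
        raise ValueError("ftt must be 0 or greater")
    return used_tb * (ftt + 1)


def freed_by_lowering_ftt(used_tb, old_ftt, new_ftt):
    """Raw capacity released when the same data moves to a lower FTT."""
    return raw_footprint_tb(used_tb, old_ftt) - raw_footprint_tb(used_tb, new_ftt)


# Illustrative figure: 3 TB of VM data moved from FTT=1 to FTT=0.
print(freed_by_lowering_ftt(3.0, old_ftt=1, new_ftt=0))
```

The space freed this way is exactly the second copy of the data, which is why a single device failure can then mean data loss.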

As for the original issue, I think TheBobkin was onto something. The important point was that a VM with 2 TB of provisioned space might in fact occupy much less disk space than that. This feature is called thin provisioning. You should look at "used space" rather than "provisioned space". When you delete a VM, you should expect to free only the "used space" amount.
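A small sketch of that distinction, using made-up numbers: the `Disk` class, its field names, and the 40 GB figure are hypothetical, and the thick-provisioned branch assumes the full provisioned size is reserved up front.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    provisioned_gb: float  # size the guest sees
    used_gb: float         # data actually written so far
    thin: bool = True      # thin disks consume space only as data is written


def expected_reclaim_gb(disks):
    """Space one would expect back (per copy of the data) after deletion."""
    total = 0.0
    for d in disks:
        # Assumption: a thick disk holds its whole provisioned size.
        total += d.used_gb if d.thin else d.provisioned_gb
    return total


# A thin 2 TB disk with only 40 GB written frees about 40 GB, not 2 TB.
print(expected_reclaim_gb([Disk(provisioned_gb=2048, used_gb=40)]))
```

So if the deleted VM's disks were thin and mostly empty, a nearly unchanged free-space figure would be the expected result.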

Regards, Pavel

jjonesprh
Contributor

I have the same issue.  The disk was thick provisioned.

I deleted a 1.5TB second disk on a VM.

Looking at the files in the VM folder, I still see the second disk, which would explain why the space was not recovered in vSAN.

When I deleted the disk, I still had replication enabled for the VM, so maybe that had something to do with it not cleaning itself up.

I removed replication but now I'm assuming I need to manually delete the file because it is still there.

Update: Deleting the file had no effect, I also don't see vsan resyncing anything.  I will open a ticket with EMC later

TheBobkin
VMware Employee

Hello jjonesprh,

How exactly did you 'delete' this vmdk Object?

Potentially you just deleted the descriptor (which is just a text file that points to the Object).

Check via RVC if you have an Unassociated Object that matches the expected number of components of a 1.5TB Object (Probably around 16 if FTT=1 and splitting per 200GB).
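The estimate of about 16 components can be reproduced with a little arithmetic. This is a sketch only: the 200 GB split size is the figure used in this reply and is treated as a parameter, the function name is made up for illustration, and witness components are left out by default so the result matches the number quoted above.

```python
import math


def estimate_components(size_gb, ftt=1, split_gb=200, witnesses=0):
    """Rough component count for a mirrored (RAID-1) object.

    Each replica is split into chunks of at most `split_gb`, and there
    are FTT + 1 replicas. Witness components can be added via `witnesses`.
    """
    per_replica = math.ceil(size_gb / split_gb)
    return per_replica * (ftt + 1) + witnesses


# 1.5 TB object, FTT=1, 200 GB chunks: 8 per replica, 2 replicas.
print(estimate_components(1500))
```

An Unassociated Object whose component count is in that neighbourhood is a reasonable candidate for the leftover 1.5TB disk.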

You can also identify Objects that used to belong to a VM by pulling the info on all Objects and then narrowing these down to just Objects with the GroupUUID of this VM (e.g. the namespace Object UUID). If you are not so handy with cmmds-tool + objtool then I would advise using this script to pull the info /usr/lib/vmware/vsan/bin/vsan-health-status.pyc - you *should* even be able to search for Objects with the friendly-name of the VM from the information this generates (regardless of whether the descriptor was deleted or not).

Bob

jjonesprh
Contributor

Bob, Thanks for the information. 

This morning when I checked the datastore size, about 1TB had been released.  Since the data is mirrored and the disk was thick provisioned, I expected more back, but maybe dedupe accounts for the difference.

I ran the python script and didn't see any matching data for the second disk I deleted.  Everything looked normal as best I could tell.

This failure to release space after deleting a disk occurred on two separate VxRail sites, so it must be a bug of some kind.

The second site still shows no space recovered, but I didn't delete anything from that site yet either.

To answer your question on the deletion, I deleted just the vmdk file that showed the largest file size.

jjonesprh
Contributor

Updates:

For the site where I deleted the pointer to the disk:

VMware engineers found no lingering or inaccessible objects so the system must have cleaned itself up.

For the second site where I took no actions beyond deleting the secondary disk using the web interface:

VMware engineers located the lingering disks and manually deleted all of them.

Space is recovering now.

We don't really know why the system was not releasing the space.
