mstevens492
Contributor

esxcli storage vmfs unmap - doesn't reclaim all available storage?

We are using 4TB thin-provisioned LUNs on our NetApp.  I have a 4TB datastore mapped to this LUN (snapshots/reserve are turned off on the NetApp side).

I am testing space reclamation with the esxcli storage vmfs unmap command.  I filled the datastore to about 95% full, so around 3.8TB.  I then deleted half of everything, and the datastore in ESXi now shows 2TB free.  When I run the esxcli unmap command, it recovers only about 100GB of space.  If I run it again, it reclaims another 70GB; run it again, another 70GB; and so on.

So my question is: why is the unmap not reclaiming the full 2TB of space I freed up, and why do I need to keep running the unmap command over and over?
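Rough math on the numbers above (both figures are approximate observations, not exact values): with ~2TB freed and only ~70-100GB coming back per pass, a full reclaim at this rate would take something like:

```shell
# ~2 TB freed, but each unmap pass only reclaims ~70-100 GB.
# At the ~70 GB worst case, estimate how many passes a full reclaim needs.
freed_gb=2048      # ~2 TB freed on the datastore
per_pass_gb=70     # low end of what one unmap pass reclaimed
awk -v freed="$freed_gb" -v per="$per_pass_gb" \
    'BEGIN { printf "unmap passes needed: %d\n", int((freed + per - 1) / per) }'
```

That's around 30 passes, which is clearly not how a single unmap run is supposed to behave.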

8 Replies
vNEX
Expert

Hi,

Maybe it's due to some failed unmap iterations. Can you please look at esxtop in the device view (u) with the "a"/"o" fields enabled?

Then check what values are in columns:

DELETE - total number of successful UNMAP operations

DELETE_F - number of failed UNMAP operations

MBDEL/s - MB deleted per second

In addition please post info about affected VMFS volume:

# vmkfstools -Ph -v 1 /vmfs/volumes/<volume label>

Thanks


a_p_
Leadership

Did you run the command with its default values, e.g. for --reclaim-unit=xxx?

Please see http://kb.vmware.com/kb/2057513 for details about command line options, and check what's supported/recommended by your storage vendor.


André

mstevens492
Contributor

@vNEX

ESXTOP shows the following for my test lun device:

delete = 3702810

delete_F = 0

MBDEL/s = 0

Should I try watching esxtop while the vmfs unmap is actively running?

Here is the volume info.

VMFS-5.58 file system spanning 1 partitions.

File system label (if any): prodsan01aa_vmlun9_SAS

Mode: public ATS-only

Capacity 4 TB, 2.4 TB available, file block size 1 MB, max file size 64 TB

Volume Creation Time: Mon Oct  7 01:00:26 2013

Files (max/free): 130000/129674

Ptr Blocks (max/free): 64512/62788

Sub Blocks (max/free): 32000/31902

Secondary Ptr Blocks (max/free): 256/256

File Blocks (overcommit/used/overcommit %): 0/1728124/0

Ptr Blocks  (overcommit/used/overcommit %): 0/1724/0

Sub Blocks  (overcommit/used/overcommit %): 0/98/0

Volume Metadata size: 825131008

UUID: 525207aa-c9e29fce-f556-68b599b3de4c

Partitions spanned (on "lvm"):

        naa.60a9800041764c745a24436a444c5374:1

Is Native Snapshot Capable: YES

OBJLIB-LIB: ObjLib cleanup done.
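As a quick sanity check on that vmkfstools output, the used file-block count converts back to the reported free space (numbers copied straight from the output above):

```shell
# Convert the "File Blocks ... used" count (1 MB blocks) back to TiB and
# check it against the reported "Capacity 4 TB, 2.4 TB available".
used_blocks=1728124   # used field from "File Blocks (overcommit/used/overcommit %)"
awk -v b="$used_blocks" 'BEGIN {
    used_tib = b / 1048576                     # 1 MB blocks -> TiB
    printf "used: %.2f TiB, free: %.2f TiB\n", used_tib, 4 - used_tib
}'
```

So VMFS itself agrees that only ~1.65TB is actually in use.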

@a_p_

We are using NetApp, and I was unable to find a recommendation from them on what size to use for the reclaim unit.  I have tried running it with the default of 200, though, as well as various other sizes all the way up to 3000, with no change in the behavior I described in my original post.

mstevens492
Contributor

Also, here is what the NetApp reports for the LUN.  Notice it says 3.3TB used.  It was at 3.9TB, but after running the vmfs unmap a bunch of times I've gotten it down to 3.3TB.  However, it should say about 1.6TB used.

lun show -v /vol/aggr1_vol0/prodsan01aa_vmlun9_SAS

        /vol/aggr1_vol0/prodsan01aa_vmlun9_SAS    4.0t (4398314946560) (r/w, online, mapped)

                Share: none

                Space Reservation: disabled

                Multiprotocol Type: vmware

                Occupied Size:    3.3t (3618815295488)

                Creation Time: Sun Oct  6 20:58:42 EDT 2013

                Cluster Shared Volume Information: 0x0

As mentioned snapshots are not enabled on the underlying volume on the netapp.  Fractional reserve is also disabled.
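Putting the two outputs side by side, the gap between what the array still holds and what VMFS reports as in use works out to (byte figures taken from the outputs above):

```shell
# NetApp still reports 3.3 TB occupied, while VMFS only has ~1.65 TiB of
# file blocks in use -- the difference is what unmap has yet to reclaim.
occupied_bytes=3618815295488            # NetApp "Occupied Size"
vmfs_used_bytes=$((1728124 * 1048576))  # 1,728,124 file blocks x 1 MiB
awk -v o="$occupied_bytes" -v u="$vmfs_used_bytes" \
    'BEGIN { printf "still unreclaimed: %.2f TiB\n", (o - u) / 1099511627776 }'
```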

gaspipe
Enthusiast

Hi.

We had a similar problem with ESXi 5.5 and 3PAR storage. What our support tech suggested was to run

esxcli storage vmfs unmap -l datastorename --reclaim-unit=999999

to see what the max number for --reclaim-unit is (the command should fail and state that the max number is ...).

Then edit the previous command to use that max as --reclaim-unit, run it, and wait for it to finish.


In our case we saw a decrease in used storage on the 3PAR LUN of approx. 400GB (out of about 4TB) in ~20 hours after the unmap finished, and the process is still continuing. Results on other storage may vary. :)

HTH.

mstevens492
Contributor

Thanks for the suggestion.  I gave it a try, found the max block count, ran the unmap, and still got the same result. :(  I do have a ticket open with both NetApp and VMware, so I'll respond here if a solution is found.

jnedela1
Contributor

Were you able to determine the correct max number for the --reclaim-unit parameter?  If so, what came back?  I ran the command

esxcli storage vmfs unmap -l datastore -n 999999

However, it did not error out and suggest a max number; rather, it simply ran.  This is a 3.75TB datastore, so it can probably handle the large number (999999 x 1MB is ~1TB), but it was a bit disconcerting that it didn't come back with a recommendation.  Is there a way to properly determine or calculate this value for NetApp-hosted datastores/volumes?
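For what it's worth, the arithmetic in that estimate checks out, and if I understand the KB right the unit is counted in VMFS file blocks, so free space gives a natural ceiling (the free_gb value below is a placeholder; substitute your datastore's actual free space):

```shell
# 999999 one-MB blocks is just under 1 TiB, as estimated above.
awk 'BEGIN { printf "999999 blocks = %.2f TiB\n", 999999 / 1048576 }'

# A rough ceiling for --reclaim-unit is the datastore's free block count
# (free space in MB, since this volume uses 1 MB file blocks).
free_gb=2400   # hypothetical free space; substitute your own value
awk -v f="$free_gb" 'BEGIN { printf "max reclaim-unit: %d blocks\n", f * 1024 }'
```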

ictadminrbassi
Enthusiast

Hi

Did you ever manage to find a solution for this? I am also having the same issue with our NetApp FAS8020 storage.

Would really appreciate any advice on this, as it's driving me crazy!  I have tickets open with VMware and NetApp, and so far VMware has washed its hands of it, saying it's an issue with the storage.

I have tried all the options outlined here but get the same issues as you have seen previously.

Thanks
