I have an ESXi host with 300 GB of space and 5 Linux guests. All of them are configured with thin-provisioned disks with a maximum size of 200 GB.
The first guest used about 100 GB, so I removed temp files and some large files. Although I deleted around 30 GB, ESXi still shows it using 100 GB.
I read on the forum that it isn't possible to reclaim this freed space without vMotion, i.e. moving the VM to another datastore.
Is this really true? Is there any tool or procedure to reclaim unused space without downtime?
Enable SSH on the ESXi server in the vSphere client (Configuration -> Security Profile -> Service Properties -> SSH options -> start).
Start an SSH client (PuTTY is a good one for Windows) and connect to your ESXi host's IP address with the root username and your password.
Go to a server with thin disks on your NFS datastore (cd /vmfs/volumes/[your NFS datastorename]/[your servername]).
Make sure the server is powered off and has no snapshots, then punch some holes (vmkfstools -K [your diskname].vmdk).
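The steps above only run in an ESXi shell, but the effect of `vmkfstools -K` can be illustrated with a rough Linux analogue: `fallocate --dig-holes` deallocates blocks that contain only zeros, just as hole-punching deallocates zeroed blocks from a thin vmdk. The file path below is a placeholder for the demo, not a real vmdk:

```shell
# Rough Linux analogue of 'vmkfstools -K' on a thin vmdk:
# 'fallocate --dig-holes' frees blocks that contain only zeros,
# shrinking the on-disk footprint without changing the file length.
demo=/tmp/thin-disk-demo.img            # placeholder standing in for a vmdk
dd if=/dev/zero of="$demo" bs=1M count=16 status=none   # 16 MB of zeros, fully allocated
before=$(du -k "$demo" | cut -f1)       # allocated KB before hole punching
fallocate --dig-holes "$demo"           # punch holes in the zeroed ranges
after=$(du -k "$demo" | cut -f1)        # allocated KB after
echo "allocated: ${before} KB -> ${after} KB (length unchanged: $(stat -c%s "$demo") bytes)"
```

As with the real command, only ranges that already contain zeros are reclaimed, which is why the guest-side zeroing step discussed later in this thread matters.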
Nothing happens on either of my NFS datastores. When I do this on my iSCSI datastore, I see a progress counter and the disk actually gets smaller.
Thanks for the prompt reply! I can confirm exactly what you are finding, unfortunately, so it looks like this is normal behaviour, which is a disappointment. A bigger disappointment is that this isn't a routine that can easily be carried out in VMware Tools or similar; surely it's a common request for anyone using thin disks.
It certainly is a common request. The only logic I can see behind them doing nothing for years and years now is the fact that EMC bought them; maybe they want people to waste tons of disk space so they buy more storage. We decided to task one of our staff with zeroing and live-migrating bloated VMs back and forth every few months, and it never fails to recover several terabytes of wasted space on our arrays.
That is one of the few nice things about the horrendous new 5.5 interface: on the Home -> vCenter -> Virtual Machines screen, it lists all of your machines with provisioned space and used space as two of the columns, and you can sort by them. That makes it very easy to spot the bloated guests to target for cleanup.
For sure; if we couldn't live migrate, dealing with this would be a nightmare. We can do our shrinking without taking the guests offline. They've really dropped the ball on this issue, and I've seen no hint of anything in the pipeline to fix it.
It still keeps me wondering why the offline punch-hole method works on local and iSCSI datastores but not on NFS datastores. I will try to find out whether VMware will ever support this.
Thanks for testing
Hmmm... it seems vSphere Essentials does not include basic support, so I cannot find out whether VMware will ever support this :smileysilly:
Not to contribute to an old thread, but I have been researching this issue as well and discovered a major problem for anyone using SnapProtect (Simpana) with snapshot backups. Every time you punch a hole (I still need to confirm this), you likely invalidate any snapshot backups, and if you migrate your VMs off to another storage unit you effectively invalidate your initial seed backup for SnapVault; migrating them back only causes the snapshots to balloon even more, as everything is seen as new data! This is indeed a problem (or opportunity) that someone needs to figure out a solution to.
I am testing out the SnapDrive "Space Reclaim Operation" (you must be licensed for SnapDrive, and it does not work if there are snapshots on the device), and so far it looks like it can do the job. However, I'm looking for a more automated approach: for example, VMware Tools could schedule it and then VAAI could notify the storage array on the next data-scrub cycle (maybe it does this already, I don't know). Ultimately, running NFS (as recommended by NetApp/VMware best practices) seems to be getting short-changed here. I agree with the post above that EMC may be involved in the slow arrival of NFS support for hole-punching, since NFS is dominated by NetApp. Surely NetApp has a simpler solution than manually running SnapDrive on every VM (we have over 230 of them in our environment).
Since this is the thread I found when searching for this, I thought I would update it to point to the VMware Knowledge Base article on the block-size observation. Long story short, it is true; VMware says that is just the way it is.
When I ran into this problem, I discovered a workaround which does not require the two volumes to have different block sizes but which requires two migrations (and a lot of free space): first, convert the disk (via migration) to "Thick Provision Eager Zeroed" (Lazy Zeroed does not work for this), zero out the empty space (using sdelete.exe or equivalent), then migrate it again and convert it to "Thin Provision".
I've used this technique extensively with ESXi 5.5 between two iSCSI-backed VMFS 5.58 datastores with the same (8MB) block size. Other restrictions still apply, of course, such as the machine being required to not have any snapshots.
I have just recently run into the space-reclaim issue and am already getting frustrated.
I will explain our situation.
We are running vSphere 6.0 U2 with thick-provisioned LUNs (for performance reasons), so running UNMAP is of no use. Storage vMotion is also of no use because of the default (and unchangeable) block size of VMFS. We also have a mixed virtual environment of Windows and Linux.
So I read about zeroing free space using tools like sdelete (Windows) and dd or sfill (Linux).
But what I really don't understand is this: either I am using the tools wrong, or no one is concerned that they fill up all the free space and take a really long time.
And no, I am not zeroing a couple of GB; I have to clear TBs...
Everyone talks as if it is something that takes some time but does no harm. I am totally stressed that I will run out of space within my VM.
As the tool uses resources and eats up all my free space, I have to run it during off-hours (of which we do not have many). But I have no idea how long it will run, and I have not yet found any document on how long it takes for, say, 1 TB.
So maybe someone can explain to me how to use these tools and give me a time estimate? I know the time depends on a lot of factors, but I think a rough estimate would still help.
Furthermore, I am running sdelete -z on Windows and have done testing using sfill -f on Linux (Ubuntu), both machines having about 1.5 TB of free space inside the VM itself. All our VMs are thin provisioned. (One side note: for the Windows VM I have been following ashawa's approach.)
Many thanks for any insights.
I think you would have been better off starting a new thread in the correct forum instead of bumping an old thread in the ESXi 4 sub-forum.
As for the time it takes to write out zeros to your virtual disks: this depends on your environment, so it is difficult to say much about that.
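For a very rough ballpark (an assumed throughput, not a measured one): the zeroing pass is essentially one sequential write over all free space, so dividing free space by sustained write speed gives an order-of-magnitude estimate:

```shell
# Back-of-the-envelope estimate: zeroing is a sequential write over all
# free space, so time ~= free space / sustained write throughput.
free_gb=1536           # ~1.5 TB of free space to zero (the figure from this thread)
throughput_mb_s=100    # assumed sustained write speed; measure your own array
seconds=$(( free_gb * 1024 / throughput_mb_s ))
echo "estimated zeroing time: ~$(( seconds / 3600 )) hours at ${throughput_mb_s} MB/s"
```

For comparison, the rate reported later in this thread (about 700 GB in 6 hours) works out to roughly 33 MB/s, which would put 1.5 TB at around 13 hours.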
Yes, the zeroing-out process can be a problem on a live system; for example, a database won't be charmed if it can no longer write to the database AND the log files.
So that is one thing to take into account.
Reclaiming can also be done by using SVMotion to an NFS share.
You mention UNMAP is of no use... but IF you don't have the space and you can take the VM offline, then perhaps you can use the vmkfstools punchzero option.
The punchzero option is quite fast, but it will of course only reclaim space that contains zeros on the guest disk, and as I mentioned in the link above, it doesn't work on vmdks living on an NFS datastore; it does work on normal VMFS.
Thank you for the response.
I should have opened a new thread, but it seemed this thread was still active.
Could you give me a little more insight into why none of you mind the drive space being used up?
Maybe I could have done a better job by not letting a system (a file server in this case) grow to TBs, with a rough 1.5 TB of free space on top?
Everyone talks as if it takes 'some' time. I am running sdelete right now; it started at 16:00, and now, 6 hours later, it is at 48% and has used up about 700 GB.
What will happen when sdelete is done? Will the free space return? Or will that happen after I run vmkfstools -K?
You need to have the FULL space available that your virtual disk can use.
What sdelete does is write out zeros, and while it is doing that it will inflate the virtual disk.
Beware that you have to use the correct syntax (it changed somewhere over the years; according to my notes you have to use the -z option for zeroing out as of version 1.6).
So the effect of sdelete is that your virtual disk will take up the full space that was reserved for it.
E.g. if your thin virtual disk is 1.5 TB in size, it will occupy the full 1.5 TB when sdelete is done zeroing out the empty space.
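The guest-side zeroing step being described can be sketched for a Linux guest as follows. Note the `count=` cap is only there to make this demo safe to run; for a real reclaim pass you omit it so `dd` runs until the filesystem is full:

```shell
# Fill free space with zeros so a later punch-hole / svMotion can reclaim it.
# The cap (count=64, i.e. 64 MB) is only for this demo; in real use dd runs
# until it hits "No space left on device", zeroing all free blocks.
zf=/tmp/zerofill.tmp
dd if=/dev/zero of="$zf" bs=1M count=64 status=none
sync                      # make sure the zeros actually reach the disk
rm -f "$zf"               # delete the fill file; the freed blocks now hold zeros
echo "zeroing pass complete"
```

This is also why the disk inflates during the pass: every free block gets a real zero written to it, so the thin vmdk grows to its full provisioned size until the holes are punched.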
The reclaiming happens when you run the vmkfstools --punchzero (or -K) option. The main disadvantage of that method is that you can only do that after shutting down the VM.
The alternatives discussed earlier on in the thread do not need downtime, but do need appropriately sized LUNs with enough free disk space to perform the storage vmotion.
The way it works, however, is that it reclaims the empty space that contains zeroed-out data. So if you don't have enough free space, you could still reclaim by partly zeroing out the disk, or even do a Storage vMotion without zeroing out at all; you will still reclaim some space, just not as much as is technically possible.
I am a bit confused.
Yesterday I finished a svMotion from NFS and back to VMFS(5), and I noticed no space was reclaimed.
So either I am misreading your post or something else goes wrong.
That is why I said zeroing needs to be done first; without it, you won't have any zeroed data.
I have enough free space on my storage as the thin vmdk has already been inflated.
So now I am running a zeroing command (dd if=/dev/zero .... because this machine is Linux).
When that command is finished, I will do a svMotion to NFS and back again.
Anyway, many thanks for the input.