VMware Cloud Community
Buggycrash
Contributor

Zero only previously deleted file space?

Good Morning All,

I'm hoping someone can help me with a special set of circumstances I have.

I have the requirement to export RHEL5 Linux VMs as Virtual Appliances periodically as a contract deliverable for my customer.

These VMs are running in ESXi and the VMDKs are on a NetApp device as NFS mounts from ESXi. The VMDKs are initially thin provisioned.

When we export, we have found that zeroing all unused disk space first can significantly reduce the size of the resulting VA. We are a development shop with lots of changes happening to the VM between VA exports. Zeroing deleted file space prevents the VA process from exporting those blocks.

The problem: if we start with thin-provisioned VMDKs and then zero the unused space on the partitions using the popular method of creating a big file full of zeros and then deleting it, we are essentially no longer thin provisioned. Creating the zero file writes to the entire virtual drive, which causes ESXi to allocate file space to the VMDK. Our 8GB thin-provisioned VMDKs become 50GB VMDKs, which means I have to allow for this greater size in my NetApp volume allocation. Remember, I am in a development shop. I do not need the speed associated with flat VMDKs. I need available storage space to allow for rapid and temporary provisioning of new VMs. If I can prevent this growth, I have more storage space available for other stuff. No, I can't use Dedup/ASIS on the NetApp. It is a 270c and Dedup/ASIS is not supported.

What I'm looking for is a Linux utility that will zero only the blocks that were previously used for files. It appears that a utility exists for this in the Windows world, SDelete by Mark Russinovich, but I haven't found a similar tool in the Linux world.

If I understand correctly, this type of tool will help me reduce the size of the VAs by turning deleted file space into zeros, which the VA export process will then not export, and will keep the VMDK thin provisioned by only writing to parts of the VMDK that have been written to before.
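To make the idea concrete, here is a minimal Python sketch of "zero only what was written before". It assumes the filesystem is unmounted (or mounted read-only) and that a list of free block numbers is already available; how that list is obtained from ext3 is not shown and is the hard part. The function name, the 4096-byte block size, and the `free_blocks` input are all hypothetical. The sketch also assumes that reading a never-written block of a thin disk returns zeros without allocating it, so only blocks holding leftover data get rewritten.

```python
import os

BLOCK_SIZE = 4096  # assumed filesystem block size, not taken from the thread


def zero_dirty_free_blocks(device_path, free_blocks, block_size=BLOCK_SIZE):
    """Zero only those free blocks that currently hold non-zero data.

    free_blocks is a hypothetical, caller-supplied iterable of block
    numbers the filesystem considers unallocated.
    Returns the number of blocks that were actually rewritten.
    """
    rewritten = 0
    with open(device_path, "r+b") as dev:
        for block_no in free_blocks:
            offset = block_no * block_size
            dev.seek(offset)
            data = dev.read(block_size)
            # Blocks that already read back as all zeros are skipped,
            # so nothing is written to them.
            if data and any(data):
                dev.seek(offset)
                dev.write(bytes(len(data)))
                rewritten += 1
        dev.flush()
        os.fsync(dev.fileno())
    return rewritten
```

Called as `zero_dirty_free_blocks("/dev/sdb1", free_list)` (both arguments hypothetical), this would touch only free blocks that still contain old file data, which is the behaviour described above. It is a sketch, not a tested tool, and writing to the wrong blocks would destroy live data.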

I welcome all suggestions.

Thanks for your time!

nick_couchman
Immortal

I don't know of a Linux tool to do this. The problem is that you're interacting with both the filesystem and the disk: you need to know which areas are available to be zeroed without writing over files. Some of the filesystems in Linux may have a way to do this, or VMware Tools may be able to help (there's a "shrink disk" function; maybe that's similar to what you're trying to do?).

I'm not sure if it would work, but have you tried using vmkfstools to copy/migrate the disk from one file to another, specifying thin provisioning for the new disk type? This may allow you to copy the disk in such a way that only real data is transferred and the rest of the space is kept zeroed (or not allocated at all). Again, not sure - I've never tried it - but maybe worth a shot?
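As a rough sketch of that suggestion, the clone could be driven from a small Python wrapper. The `-i` (clone) and `-d thin` (target disk format) options are real vmkfstools options; the function names and any paths passed in are hypothetical, and whether the copy actually skips zeroed blocks in this setup is exactly the part that would need testing.

```python
import subprocess


def thin_clone_command(src_vmdk, dst_vmdk):
    """Build the vmkfstools argument list for cloning a disk as thin."""
    return ["vmkfstools", "-i", src_vmdk, dst_vmdk, "-d", "thin"]


def thin_clone(src_vmdk, dst_vmdk):
    """Run the clone; raises CalledProcessError if vmkfstools fails."""
    subprocess.run(thin_clone_command(src_vmdk, dst_vmdk), check=True)
```

The VM should be powered off before cloning, and the original VMDK kept until the copy has been verified.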

Buggycrash
Contributor

Nick,

Thanks for the info. I may be wrong, but I don't believe that vmkfstools works with ESX/ESXi. I think I have read in the past that it is only for VMware Server (GSX).

I should have mentioned, I am using ext3.

I have seen some mention of undelete tools. I was thinking that perhaps I could use one of those tools to identify the blocks that "used" to be files, and then find some way to write zeros to only those. I think that is essentially what SDelete does on NTFS. I don't know ext3 well at all, so perhaps it is not possible with this file system.

Thanks again.

nick_couchman
Immortal

No, vmkfstools works just fine on ESX and ESXi. You can access the tool via the Service Console on ESX, the Unsupported console on ESXi, or by installing the RCLI tools, which connect to both ESX and ESXi.

Buggycrash
Contributor

Interesting! I guess I need to read up on those tools.

The other snag in this equation is that I'm using scheduled NetApp snapshots, so even if I am able to use vmkfstools to migrate to a thin-provisioned VMDK, I still have the bloated VMDK sitting in my snapshot queue for days or weeks. That's why I was looking to do this by writing only the data I absolutely must. That is really the battle I have. NetApp snapshots are nice, but the delta on deleting 50GB of data is, well, I believe 50GB. In our development shop we pop these VMs in and out of existence like candy, so my 2.2TB storage device grinds down to a handful of GB very quickly.

The one thing I can think to do is to try my best to keep the VMs thin provisioned. Exploding them and shrinking or converting them will still create a huge differential track in the NetApp snapshots which I need to avoid. We have to minimize the VA size though, so I don't see a way to eliminate the step that zeros previously used file space. Without doing the "big file of zeros" our VAs are larger than a Dual Layer DVD, but with the zeroing the VAs are about half a single layer DVD. I can't expect all our VA recipients to have Blu-Ray, so right now Dual Layer DVD is our practical export size limit.

I'll learn what I can about the tools though. Since I thought they didn't apply I never really looked at their capability.

Thanks for your help!

nick_couchman
Immortal

Is there no possibility of adjusting the snapshot schedule for the ESX volume on the NetApp appliance to a schedule that works better for that environment? Or doing the snapshots manually? Or do you still need the regular snapshots of the VMs, you're just trying to avoid getting all the extra data?

Buggycrash
Contributor

Well, I have been pretty aggressive on the snapshot schedules. Since we are a development shop, I have hourly snapshots during regular business hours just in case the developers need to revert. After a year or so, we have only had to revert a couple of times, so I can probably relax the snapshot schedule, or perhaps turn it off during our export process. I'm not sure if that will reduce my storage usage though. I may end up with fewer snapshots, but each of them larger, reflecting the greater data change between snaps, since the NetApp never really deletes a piece of data until all the snapshots which point to it expire. I'll have to test.
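A toy Python model of that snapshot behaviour may help when planning the test. It treats each block as 1GB and uses the 8GB and 50GB figures from earlier in the thread. The block numbering, the function names, and the assumption that the bloated VMDK is replaced by a fresh 8GB thin copy are all illustrative simplifications, not a description of how the NetApp works internally.

```python
def space_used(active, snapshots):
    """Count distinct blocks referenced by the active data or any snapshot."""
    referenced = set(active)
    for snap in snapshots:
        referenced |= snap
    return len(referenced)


def simulate(snapshot_while_bloated):
    """Return GB held after an export cycle, with or without a snapshot
    being taken while the VMDK is bloated by the zero fill."""
    original = set(range(0, 8))        # 8 GB thin VMDK
    bloat = set(range(8, 50))          # 42 GB more allocated by the zero fill
    thin_copy = set(range(100, 108))   # fresh 8 GB thin copy (assumed)

    snapshots = [frozenset(original)]  # a snapshot from before the export
    active = original | bloat          # VMDK after the zero fill
    if snapshot_while_bloated:
        snapshots.append(frozenset(active))
    active = thin_copy                 # bloated VMDK deleted, thin copy kept
    return space_used(active, snapshots)


if __name__ == "__main__":
    print(simulate(True))   # 58
    print(simulate(False))  # 16
```

In this model, a single snapshot taken during the bloated window keeps all 50GB pinned until it expires, while pausing snapshots for the duration of the export leaves only the earlier 8GB snapshot and the new thin copy.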

It's worth a shot though. Thanks!

Buggycrash
Contributor

Although I still haven't found the tool I'm looking for, I will look into vmkfstools and take a hard look at rescheduling my NetApp snapshots.

Thanks!
