VMware Cloud Community
madajiq
Contributor

Delete middle delta file. Snapshot removed but Delta file not consolidated due to host disk full.

Hi everyone,

I have a question about deleting delta files from the middle of a snapshot chain using the datastore browser.

 

This is the scenario:

 

A VM is taking up space because of unconsolidated delta files. Running "Delete All" from "Manage Snapshots" did not consolidate them because the disk was full. I tried to consolidate with the VM powered off and 3TB of free space, but disk usage still grew until the host disk was almost fully utilized. After a very long consolidation run, it returned an insufficient disk space error and left 33GB of free space. The VM is thin-provisioned for ~5TB, but it now uses almost all of my host disk, which is 19TB in size.

 

After deleting all snapshots from the "Manage Snapshots" menu, we went through the datastore browser and found that in the middle of the delta file chain there are 2 delta VMDK files that use almost 10TB of storage. This server was not maintained by us previously, and the previous administrators probably did not perform a proper disk consolidation after taking snapshot backups.


Culprits:

000004.vmdk - 4.3TB

000005.vmdk - 5.5TB

 

So coming into the question:

 

1. Can I delete the 2 delta files directly from the datastore browser without affecting the state of my machine?

 

2. What about the probability of data loss or data corruption?

I have read a similar post about this (https://communities.vmware.com/t5/ESXi-Discussions/Confirm-Have-4-snapshots-want-to-remove-the-2-mid...), but only the following points are discussed there:

   (a) The data will be consolidated into the latest delta file. I assume this will not affect the machine state.

   (b) The data will be consolidated into the parent delta file. (I am confused by this explanation.)

   (c) Data loss or corruption is not discussed.

If I delete the 2 middle delta files and fix the chain by pointing the VMX file to the right disk, will the data be consolidated into the latest snapshot I took without any data loss?

 

3. Can I boot it up with the latest data using only the base disk (_1.vmdk), and edit the VMX file to point the delta disk (_1-00008) directly to the base disk (_1.vmdk)?

 

Note:

This is a discussion. I can't replicate this 100% in my lab environment and am just trying to understand the concept. I am looking for advice from someone who has experienced this type of incident in the past. Currently the server is powered down to avoid consuming more space.

 

Thank you

 

a_p_
Leadership

You can't delete any files in a snapshot chain, because a snapshot contains data blocks which have changed after creating the snapshot. Deleting one of the files will result in data loss, and a corrupted snapshot chain!
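To make the point above concrete, here is a minimal toy model in Python. It is only a sketch: the block contents and the three delta layers are invented for illustration, and real VMDK deltas track changed disk regions in a binary format rather than in dictionaries. It shows that a read is answered by the newest layer that contains the block, so removing a middle layer silently returns older data or nothing at all.

```python
# Toy model of a snapshot chain: each layer stores only the blocks
# that changed while it was the active (topmost) disk.
from typing import Dict, List, Optional

Layer = Dict[int, str]  # block number -> block contents


def read_block(chain: List[Layer], block: int) -> Optional[str]:
    """Return the newest copy of a block, searching from the top layer down."""
    for layer in reversed(chain):
        if block in layer:
            return layer[block]
    return None  # block was never written in any remaining layer


# All contents below are hypothetical and only stand in for real disk blocks.
base = {0: "boot-v1", 1: "data-v1", 2: "log-v1"}  # stands in for the flat file
delta_old = {1: "data-v2"}                        # older delta
delta_mid = {2: "log-v2", 3: "new-file"}          # "middle" delta
delta_new = {1: "data-v3"}                        # newest delta

full_chain = [base, delta_old, delta_mid, delta_new]
broken_chain = [base, delta_old, delta_new]       # middle delta deleted

for blk in range(4):
    print(blk, read_block(full_chain, blk), read_block(broken_chain, blk))
```

In this model, block 2 falls back to its stale base copy and block 3 disappears once the middle layer is gone, even though the newest delta is untouched.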

To find out how much additional disk space may be required for a successful consolidation, please provide the free disk space on the datastore and a complete file listing of the VM's files. To do this, run the following two commands from within the VM's folder, and attach the resulting filelist.txt to your next reply. You can of course change the file names in the .txt file if you want, but change only the VM's name, not the other parts of the file names.

df -h > filelist.txt
ls -lisa >> filelist.txt

André

madajiq
Contributor

Hi @a_p_ ,

Thanks for replying !

I have attached the filelist.txt as requested. Free disk space is now ~40GB after we purged old data from the VM, but that is still not enough to perform the disk consolidation.

 

However, I would like to know whether you have done something similar to the following:

 

  • Modify the VM configuration file (VMX): in the ide0:1.fileName or scsi0:1.fileName entry, change the file name to point to the 00008 delta file.

  • Remove or rename the intermediate delta VMDK files: delete or rename the 00004.vmdk and 00005.vmdk files to prevent the virtual machine from attempting to use them during boot-up.

  • Edit the descriptor file (VMDK): locate the descriptor file associated with the 00006 delta file.

  • Modify the parentCID: within the descriptor file, locate the line that starts with "parentCID" and update the value to match the CID of the 00003 delta file, so that 00006 points to 00003 as its parent. The CID is an identifier associated with each delta VMDK file. Save the changes to the descriptor file.

START:

parentFileNameHint="Fisher_1-000003.vmdk"
# Extent description
RW 12582914146 SESPARSE "Fisher_1-000006-sesparse.vmdk"

END:

 

Logical :

00001.vmdk

00002.vmdk

00003.vmdk

00006.vmdk

00007.vmdk

00008.vmdk

  • Restart the virtual machine
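For anyone who wants to inspect the descriptor fields named in the steps above, here is a rough Python sketch that walks a chain and compares each child's parentCID with its parent's CID. The descriptor texts and all CID values are hypothetical, the chain is shortened to three files, and only the CID, parentCID and parentFileNameHint fields are read. A clean result only means the metadata is consistent; it says nothing about data blocks stored in delta files that were removed.

```python
import re
from typing import Dict, List


def parse_descriptor(text: str) -> Dict[str, str]:
    """Extract CID, parentCID and parentFileNameHint from descriptor text."""
    fields = {}
    for key in ("CID", "parentCID", "parentFileNameHint"):
        match = re.search(rf'^{key}="?([^"\n]*)"?\s*$', text, re.MULTILINE)
        if match:
            fields[key] = match.group(1)
    return fields


def check_chain(descriptors: Dict[str, str], top: str) -> List[str]:
    """Walk from the top descriptor down to the base and report mismatches."""
    problems = []
    current = top
    while True:
        child = parse_descriptor(descriptors[current])
        parent_name = child.get("parentFileNameHint")
        if parent_name is None:  # the base disk has no parent hint
            break
        if parent_name not in descriptors:
            problems.append(f"{current}: parent {parent_name} is missing")
            break
        parent = parse_descriptor(descriptors[parent_name])
        if child.get("parentCID") != parent.get("CID"):
            problems.append(
                f"{current}: parentCID does not match CID of {parent_name}"
            )
        current = parent_name
    return problems


# Hypothetical descriptor contents; the CID values are made up.
descriptors = {
    "Fisher_1.vmdk": "CID=11111111\nparentCID=ffffffff\n",
    "Fisher_1-000003.vmdk": (
        'CID=33333333\nparentCID=11111111\n'
        'parentFileNameHint="Fisher_1.vmdk"\n'
    ),
    # Hint already edited to 000003, but parentCID still holds the old parent's CID.
    "Fisher_1-000006.vmdk": (
        'CID=66666666\nparentCID=55555555\n'
        'parentFileNameHint="Fisher_1-000003.vmdk"\n'
    ),
}

print(check_chain(descriptors, "Fisher_1-000006.vmdk"))
```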

 

 

I tried to set this up in my lab, but the VM can't boot, maybe because of something missing in my ESXi configuration. If this procedure is done correctly, will the VM be able to boot?

 

 

Thank you !

a_p_
Leadership

I've modified virtual disk metadata many times, but usually to fix things rather than to break them (except for break&fix tests).
Deleting any of the files in this case will result in data loss and data corruption, so this is likely not the way you want to go.

Options to get rid of these snapshots (the oldest one from 2019) are:

  1. back up the VM, delete the VM, and finally restore it from the backup
  2. temporarily free up about 1.2 TB (better a little bit more) to allow the consolidation ("Delete All") to complete

I don't know about your VMs and the uptime requirements, but I'd prefer option 2 if possible.
A successful consolidation will free up >14TB on the datastore.

André

madajiq
Contributor

"I've modified virtual disk metadata many times, but usually to fix things rather than to break them (except for break&fix tests).
Deleting any of the files in this case will result in data loss and data corruption, so this is likely not the way you want to go."

- So, in a way, if we did not worry about any data loss, we can actually execute the metadata editing?

 

And would you mind sharing how you calculated the storage space needed to allow the consolidation to complete?

 

Thank you!

a_p_
Leadership

>>> So, in a way, if we did not worry about any data loss, we can actually execute the metadata editing?

Of course you can. In this case, assuming that the snapshots are in the sequence of their file names (this should be confirmed before deleting files), you could "revert" the VM to the state of e.g. Dec 6 "Fisher_1-000004.vmdk", and delete the newer snapshot files 000005 through 000008, which will free up about 1,174,989,824 kB, i.e. a bit less than the possible maximum required space.

>>> how you calculated the storage space needed to allow the consolidation to complete 

The base virtual disk's flat .vmdk file currently consumes 5,059,914,752 kB (the second column in the listing below). With a provisioned size of 6,291,457,073 kB, and snapshots that are larger than the difference between these two values, the flat file could grow up to the provisioned size.
The difference is 1,231,542,321 kB, which is about 1.23 TB.

12586756 5059914752 -rw------- 1 root root 6442452042752 Dec 6 08:04 Fisher_1-flat.vmdk
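The same arithmetic can be reproduced with a short Python sketch. It assumes, as in the calculation above, that the second column of the `ls -lisa` line is the allocated size in 1 kB (1024-byte) blocks and the seventh column is the provisioned size in bytes. The function names are made up for this example, and the simple whitespace split would not handle file names that contain spaces.

```python
# Sketch: estimate how much the thin base disk could still grow during
# consolidation, using the `ls -lisa` line quoted above.
LISTING_LINE = (
    "12586756 5059914752 -rw------- 1 root root 6442452042752 "
    "Dec 6 08:04 Fisher_1-flat.vmdk"
)


def parse_ls_lisa(line: str) -> dict:
    """Split one `ls -lisa` line into the fields used for the estimate."""
    fields = line.split()
    return {
        "allocated_kib": int(fields[1]),  # assumed: allocated size in 1 KiB blocks
        "size_bytes": int(fields[6]),     # provisioned (logical) size in bytes
        "name": fields[-1],
    }


def max_growth_kib(entry: dict) -> int:
    """Provisioned size minus currently allocated space, in KiB (rounded up)."""
    provisioned_kib = -(-entry["size_bytes"] // 1024)  # ceiling division
    return provisioned_kib - entry["allocated_kib"]


entry = parse_ls_lisa(LISTING_LINE)
growth = max_growth_kib(entry)
print(f"{entry['name']}: up to {growth:,} KiB of additional space")
# prints: Fisher_1-flat.vmdk: up to 1,231,542,321 KiB of additional space
```

This is a worst-case figure. The flat file only grows by the blocks in the snapshots that are not yet allocated in the base disk, so the actual requirement may be lower.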

André

madajiq
Contributor

Hi @a_p_ ,

 

I am sorry for delayed response, as I was trying to comprehend your reply and did some research.

 

Can you please help explain why the VM's storage currently consumes 5.09 TB, when the actual data inside the guest OS running in the VM is only 300GB because we automatically purge data when it hits the threshold? This is what really confuses us about how VM storage works.

 

I really appreciate your view on this!

 

Thank you

a_p_
Leadership

>>> I am sorry for delayed response ...
No need to apologize, you've got a problem, not I 😉

Physical disk space gets allocated if the guest OS writes to previously unused disk locations, but that disk space will not get released if the ESXi version, the underlying storage, and guest OS do not support automatic space reclamation. Please refer to VMware's documentation for "Space Reclamation" as this is a huge topic.

In case space reclamation is not supported with your environment, you can still reduce a thin provisioned virtual disk's physical disk space from the command line, after zeroing unused disk space from within the guest OS. See e.g. https://www.yellow-bricks.com/2011/07/15/punch-zeros-2/
Please note that this must NOT be done on virtual disks with snapshots!
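As a rough illustration of why 300GB of live data can sit on a multi-TB thin disk, here is a toy Python model. The class, the block counts and the method names are all invented for this sketch. It models allocation on first write and a zero-punch pass, and it does not use or imitate any VMware tool.

```python
class ThinDisk:
    """Toy thin-provisioned disk: backing space is allocated on first write."""

    def __init__(self, provisioned_blocks: int) -> None:
        self.provisioned_blocks = provisioned_blocks
        self.blocks = {}  # block number -> contents, only for allocated blocks

    def write(self, block: int, data: bytes) -> None:
        if not 0 <= block < self.provisioned_blocks:
            raise IndexError("block outside provisioned size")
        self.blocks[block] = data  # allocates the block if it was unused

    def allocated(self) -> int:
        return len(self.blocks)

    def punch_zeros(self) -> int:
        """Release every allocated block that contains only zero bytes."""
        zeroed = [b for b, data in self.blocks.items() if not any(data)]
        for b in zeroed:
            del self.blocks[b]
        return len(zeroed)


disk = ThinDisk(provisioned_blocks=1000)

# The guest writes 500 blocks, then "purges" 450 of them. A guest delete only
# updates file system metadata, so nothing changes at the hypervisor level.
for b in range(500):
    disk.write(b, b"\x01" * 8)
live_blocks = set(range(50))
print(disk.allocated())  # still 500, although only 50 blocks hold live data

# Zero the free space from inside the guest, then punch the zeroed blocks.
for b in range(500):
    if b not in live_blocks:
        disk.write(b, b"\x00" * 8)
released = disk.punch_zeros()
print(released, disk.allocated())
```

In the model, allocation stays at the high-water mark until the freed blocks are zeroed and punched, which mirrors the two-step procedure described above.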

André
