VMware Cloud Community
alfredosola
Contributor

Recovering from a broken snapshot chain

Good day,

A virtual machine with a Windows 2003 Server guest, with VMware Tools installed, was running smoothly on ESX 3.5. After performing some maintenance I noticed that there were a number of old snapshots, created by an over-aggressive schedule that was fixed long ago. I clicked "Delete All" in Snapshot Manager. After several hours, the machine powered off, apparently without user intervention. An attempt to power it on produced the error "The parent virtual disk has been modified since the child was created". At that point, I made a couple of copies of the files and started trying to make sense of it. These are the vmdks:

Size (bytes)  Name           CID       ParentCID  parentFileNameHint
342           x.vmdk         43c2386f  ffffffff   -
229           x-000019.vmdk  3abd27a5  0f81d0e9   x.vmdk
236           x-000001.vmdk  3abd27a5  3abd27a5   x-000019.vmdk

And these are the extents:

Size (bytes)  Name
32212254720   x-flat.vmdk
251721728     x-000019-delta.vmdk
63488         x-000001-delta.vmdk

After much googling and reading (BTW, there is a very informative site on vmdk recovery at http://www.sanbarrow.com/sickbay.html), I have spent almost 3 days trying to recover a handful of important files in there. I have tried arranging the CID chain in every sensible combination, even leaving out the tiny snapshot numbered 000001. I have copied the files to my Fusion installation, while a coworker tried cloning on an ESX host. The only thing we have been successful at is booting the box as it was about 3 years ago, which is about its age; unfortunately, the files I need are either in the 000019 snapshot or lost forever.

Since this is a really desperate attempt to recover just a few files, I am wondering if there is a way to force ESX, ESXi or Fusion to use the snapshot regardless of the modification to the parent disk. Or perhaps there is something I have overlooked. Any pointers are appreciated. A professional recovery service is not out of the question, though I'd prefer to take the opportunity to learn a bit.

wila
Immortal

Hi,

You are in "luck", as the person behind sanbarrow.com is active on this forum.

He even offers commercial data recovery services if needed. I've notified him of your problem, and he will likely reply here if he has the time to help.



--
Wil
_____________________________________________________
VI-Toolkit & scripts wiki at http://www.vi-toolkit.com

Contributing author at blog www.planetvm.net

Twitter: @wilva

| Author of Vimalin. The virtual machine Backup app for VMware Fusion, VMware Workstation and Player |
| More info at vimalin.com | Twitter @wilva

continuum
Immortal

Hi Alfredo

Let's try to recover the full chain first, before we try to extract raw data from snapshot 19.

I need to inspect the descriptor vmdks first. Please zip them together with a listing of the files you have, and attach the zip to your next post.

Ulli
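
Collecting the descriptors and a file listing by hand is easy to get wrong, because the large `-flat` and `-delta` extents share the `.vmdk` suffix. The following is a minimal Python sketch of one way to do it, under stated assumptions: descriptors are taken to be small text files whose names do not end in `-flat.vmdk` or `-delta.vmdk`, and the function name `bundle_descriptors`, the 16 KiB size limit, and the `filelist.txt` entry are all made up for illustration.

```python
import zipfile
from pathlib import Path

EXTENT_SUFFIXES = ("-flat.vmdk", "-delta.vmdk")  # data extents, not descriptors


def bundle_descriptors(vm_dir, out_zip, max_descriptor_size=16 * 1024):
    """Zip the small text descriptors plus a size/name listing of every file.

    Assumption: a descriptor is a .vmdk that is not an extent and is at most
    `max_descriptor_size` bytes (an arbitrary safety limit chosen here).
    """
    vm_dir = Path(vm_dir)
    out_path = Path(out_zip).resolve()
    listing = []
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as archive:
        for entry in sorted(vm_dir.iterdir()):
            if not entry.is_file() or entry.resolve() == out_path:
                continue  # skip directories and the archive being written
            size = entry.stat().st_size
            listing.append(f"{size:>14}  {entry.name}")
            name = entry.name.lower()
            is_descriptor = (
                name.endswith(".vmdk")
                and not name.endswith(EXTENT_SUFFIXES)
                and size <= max_descriptor_size
            )
            if is_descriptor:
                archive.write(entry, arcname=entry.name)
        # The listing covers all files, including the extents left out of the zip.
        archive.writestr("filelist.txt", "\n".join(listing) + "\n")
    return listing
```

Only the descriptors go into the archive; the extents are listed by name and size but never read, so the sketch is safe to run against a copy of a large VM directory.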




___________________________________

VMX-parameters - WS FAQ - MOAcd (http://sanbarrow.com/moa241.html) - VMDK-Handbook


________________________________________________
Do you need support with a VMFS recovery problem ? - send a message via skype "sanbarrow"
I do not support Workstation 16 at this time ...

alfredosola
Contributor

Good day,

Thanks for your quick response. Here they are.

Later,

continuum
Immortal

Hi

See the attached zip file.

I edited all 3 vmdks. Make sure that sedeq_w2k3-000001.vmdk is listed in the vmx file!
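
To confirm which descriptor a VM will actually open, the `fileName` entries in the vmx file can be read out. A minimal Python sketch follows, assuming the usual `scsi0:0.fileName = "..."` line format; the function name `disks_in_vmx` is hypothetical, and the filter simply keeps values ending in `.vmdk`.

```python
import re

# Matches lines such as: scsi0:0.fileName = "sedeq_w2k3-000001.vmdk"
# Quotes are treated as optional.
_DISK_LINE = re.compile(
    r'^[ \t]*((?:scsi|ide|sata)\d+:\d+)\.fileName[ \t]*=[ \t]*"?([^"\r\n]*)"?',
    re.IGNORECASE | re.MULTILINE,
)


def disks_in_vmx(vmx_text):
    """Map each device slot (e.g. 'scsi0:0') to the .vmdk it points at."""
    disks = {}
    for slot, name in _DISK_LINE.findall(vmx_text):
        name = name.strip()
        if name.lower().endswith(".vmdk"):  # ignore CD-ROM and similar entries
            disks[slot.lower()] = name
    return disks
```

The function takes the text of the vmx file, so it can be tried on a copy without touching the VM.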




alfredosola
Contributor

Hi,

Tried with those, but no luck. If I point the vmx file to 000001, I get the error "The parent virtual disk has been modified since the child was created". If I point it to the vmdk without numbers (which I believe to be the parent, since its ParentCID is ffffffff), then the virtual machine runs, but without any files from the last 3 years or so.

I am testing in Fusion.

I tried to fiddle around with the Snapshot Manager a bit more, but no luck.

Any other ideas?

TIA,

a_p_
Leadership

Actually, the parentCID of a snapshot has to match the CID of its parent VMDK.

Due to the way "Delete All" worked until ESX(i) 4 U2 (where VMware finally changed this strange behavior), I assume that snapshot 1 has already been merged into snapshot 19. So I think all you need to do is set the parentCID of snapshot 19 to the CID of the base vmdk.

In your VM's settings, select snapshot 19 as the virtual hard disk file.

Maybe worth a try.

André

sedeq_w2k3-000019.vmdk:

# Disk DescriptorFile
version=1
CID=3abd27a5
parentCID=43c2386f
createType="vmfsSparse"
parentFileNameHint="sedeq_w2k3.vmdk"

# Extent description
RW 62914560 VMFSSPARSE "sedeq_w2k3-000019-delta.vmdk"

# The Disk Data Base
#DDB

There's also good documentation on this topic: Troubleshooting Virtual Machine snapshot problems.
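
The rule stated in this post (each snapshot's parentCID must equal the CID of the descriptor named in its parentFileNameHint) can be checked mechanically. The following is a minimal Python sketch, assuming plain-text descriptors like the one quoted above; the names `read_descriptor` and `check_chain` are made up for illustration, and the code only reports mismatches, it never edits a file.

```python
import re
from pathlib import Path

# Only the three fields that matter for the chain; everything else is ignored.
_FIELD = re.compile(
    r'^[ \t]*(CID|parentCID|parentFileNameHint)[ \t]*=[ \t]*"?([^"\r\n]*)"?[ \t]*\r?$',
    re.MULTILINE,
)

BASE_MARKER = "ffffffff"  # parentCID of a disk that has no parent


def read_descriptor(path):
    """Return CID, parentCID and parentFileNameHint from one text descriptor."""
    text = Path(path).read_text(errors="replace")
    return {key: value.strip() for key, value in _FIELD.findall(text)}


def check_chain(leaf):
    """Walk from a leaf descriptor towards the base and list every problem found."""
    problems = []
    current = Path(leaf)
    seen = set()
    while current not in seen:
        seen.add(current)
        child = read_descriptor(current)
        # Assumption: a missing parentCID is treated like a base disk.
        parent_cid = child.get("parentCID", BASE_MARKER).lower()
        if parent_cid == BASE_MARKER:
            return problems  # reached the base disk
        hint = child.get("parentFileNameHint")
        if not hint:
            problems.append(f"{current.name}: no parentFileNameHint")
            return problems
        parent_path = current.parent / hint
        if not parent_path.exists():
            problems.append(f"{current.name}: parent {hint} not found")
            return problems
        parent = read_descriptor(parent_path)
        if parent_cid != parent.get("CID", "").lower():
            problems.append(
                f"{current.name}: parentCID={child['parentCID']} does not match "
                f"CID={parent.get('CID')} of {parent_path.name}"
            )
        current = parent_path
    problems.append(f"{current.name}: loop in parentFileNameHint chain")
    return problems
```

Run in the directory holding the three descriptors as originally posted, `check_chain("x-000001.vmdk")` would flag x-000019.vmdk, because its parentCID 0f81d0e9 does not match the base CID 43c2386f. A matching CID chain only means the descriptors are consistent; it says nothing about whether the data in the delta file is intact.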

continuum
Immortal

Can you post a file listing and all the vmware.log files you have?

Hmmm - the edits I sent should not result in a "parent has been modified" message, even if André's idea that the 000001.vmdk snapshot is an orphan is correct.

Let's see the logs.




alfredosola
Contributor

Old and new logs attached.

To a.p.: I think what you say is likely, but I tried that one with no luck.

From some fun with strings and grep it looks like the data I need to recover may actually be in the 000019-delta.vmdk. At least, some filenames are mentioned which are not in the base disk.

Thanks everyone for your help so far, it's very encouraging.
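
For anyone without `strings` and `grep` at hand, the same kind of scan can be approximated in a few lines of Python. This is a rough sketch rather than the exact commands used here: it extracts printable ASCII runs of at least 4 characters and filters them with a regular expression. The names `iter_strings` and `grep_strings` are made up for illustration, and names stored as 16-bit characters would need a separate pass.

```python
import re

_PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")  # runs of 4+ printable ASCII bytes


def iter_strings(path, chunk_size=1 << 20):
    """Yield (file_offset, text) for printable ASCII runs, reading in chunks."""
    with open(path, "rb") as handle:
        offset = 0    # file offset of the first byte currently in `buffer`
        buffer = b""
        while True:
            chunk = handle.read(chunk_size)
            buffer += chunk
            # Hold back a trailing printable run so it is not cut at a chunk edge.
            keep = 0
            if chunk:
                while keep < len(buffer) and 0x20 <= buffer[-1 - keep] <= 0x7E:
                    keep += 1
            cut = len(buffer) - keep
            for match in _PRINTABLE_RUN.finditer(buffer[:cut]):
                yield offset + match.start(), match.group().decode("ascii")
            offset += cut
            buffer = buffer[cut:]
            if not chunk:
                break


def grep_strings(path, pattern):
    """Return the (offset, text) pairs whose text matches `pattern`."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [(off, text) for off, text in iter_strings(path) if regex.search(text)]
```

Something like `grep_strings("x-000019-delta.vmdk", r"\.doc")` would list candidate hits with their byte offsets. A hit only shows that the text exists somewhere in the delta file, not that the file it belongs to can be recovered.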

continuum
Immortal

I am a bit confused about the last log you created on Fusion.

It says

scsi0:0.fileName = sedeq_w2k3-000002.vmdk

How is that? I thought you only had

sedeq_w2k3-000019.vmdk

sedeq_w2k3-000001.vmdk

sedeq_w2k3.vmdk




alfredosola
Contributor

Sorry, that is just some additional fiddling on my part. The snapshot you mention is not part of the original files; I created it yesterday (timestamp: 18 jun 09:43) while trying to force the consolidation of 000019. Please ignore that one.

continuum
Immortal

Do you by chance have a Windows machine with Workstation where I could log in via RDP, so that I can check directly?




alfredosola
Contributor

Nope, but I'll arrange something. Stay tuned!

mjschug
Contributor

I too am having a very similar problem with a server we had to move from one datastore to another. I'm now getting the same "parent has changed" error message. Any help would be appreciated.

Thanks!

-Matt

continuum
Immortal

Hi Matt

Very similar is not the same ;-)

Please create a post of your own so that we don't mix things up here.

If you want, check my site http://sanbarrow.com/sickbay.html#anamnesis

There I list what data is required to fix issues like this.

I am on the road on Saturday, so I can't check back before Sunday.

Ulli






continuum
Immortal

Here is a short summary of what I did ...

We arranged a remote session via RDP.

First I fixed the CID-chain of the broken snapshot-tree - using only snapshot 000019.

The other snapshot contained no data at all - as I found out by inspecting the delta-file with a hex-editor.

After that fix the VM could be started again but unfortunately snapshot 000019 was so corrupted that the VM could not boot anymore.

It said "ntdetect.com" is missing.

Next I booted the VM with a MOA-LiveCD and noticed that the disk was not readable at all.

I tried the usual procedure to recover a lost partition table with testdisk.

That was not successful either.

So the next step was to try a raw recovery with Ontrack EasyRecovery.

This way I was able to extract about 14 GB of data from the corrupt vmdk, and I copied that to a new vmdk I added to the VM.

I do not know how much of this 14 GB of data is actually usable - that has to be checked file by file.

Summary: unfortunately there was no chance to recover the VM completely - I hope that the extracted data is useful, though.

Ulli
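
The hex-editor inspection of the delta file mentioned in this summary can be roughly approximated with a short script. The Python sketch below does not parse the sparse-extent format at all; it only reports which 512-byte sectors contain any non-zero byte, which is an assumption-laden stand-in for eyeballing the file. The function name `nonzero_sector_offsets` is hypothetical.

```python
def nonzero_sector_offsets(path, sector_size=512):
    """Return the byte offset of every sector holding at least one non-zero byte."""
    offsets = []
    with open(path, "rb") as handle:
        offset = 0
        while True:
            block = handle.read(sector_size)
            if not block:
                break
            # Stripping zero bytes leaves something only if the sector has data.
            if block.strip(b"\x00"):
                offsets.append(offset)
            offset += len(block)
    return offsets
```

If the only non-zero sectors sit at the very start of a small delta file, that is consistent with a snapshot that holds metadata but no guest data. Confirming it still requires looking at the actual bytes, as was done here.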




wila
Immortal

Wow... that must have been pretty time-consuming.

Hope that the recovered data is of value to the OP.

--

Wil

alfredosola
Contributor

Hi,

First of all, many thanks Ulli for all your help and dedication. We wouldn't have gotten anything back without you.

Some of the recovered data is indeed useful. Pity that not all the data was recovered, but that was a tall order.

The lessons learnt here have been stated many times, but once again: snapshots are not a substitute for a proper backup. There are many good backup products out there that make backing up virtual machines very easy; Veeam comes to mind, just to name a popular one. Or you can just script cloning and rotation with vMA.

Anyway. RAID is not backup, snapshots are not backup, and when the proverbial organic matter hits the fan, only a separately stored backup will save the day. Or hiring Ulli, that is :-)

continuum
Immortal

... speaking of backup tools: if you consider esxpress, you may have me on the phone if you ever run into problems ;-)

Ulli



