VMware Cloud Community
GlenB
Contributor

Tool to un-corrupt a redolog?

Is anyone interested in writing a utility for the rest of us? It has happened to me, and I have read that it has happened to others as well: "the redolog has been detected to be corrupt". In each case for me it happened because of an abrupt power-down of the host machine. If VMware cannot write code that protects itself from such events, then we need something to clean up afterwards. The only remedy provided is to delete the redolog, which means losing all of the changes that were in that log ... possibly many GB of data.

My gut tells me that it was corrupted because a change was started and never finished. That means there is one thing, the last thing done, in those GB of data that is now garbage. All the rest is OK, and it's a shame to lose it. Even relying on the previous day's backup, if you have such a work practice, does not get you back everything.

So the utility I am thinking of needs to look at a VM, find the corrupted redolog (the _{disk#}-00000#-delta.vmdk file), find that last transaction, and edit the -delta file to make it look like that transaction never happened. After that, the user gets told to manually add a snapshot and delete all snapshots (which should succeed) and the disk is back to a -flat file and all the corruption is gone. The last transaction is lost.

Anyone interested?

Regards - Glen

1 Reply
wila
Immortal

Anyone interested in writing a utility for the rest of us?

Would you buy a tool like that? How much would you be willing to spend on it? Or are you hoping that developers will work on this for free? It involves reverse engineering a number of things.

My gut tells me that it was corrupted because a change was started and never finished. That means there is one thing, the last thing done, in those Gb of data that is now garbage. All the rest is OK - it's a shame to lose it. Even relying on the previous day's backup - if you have such a work practice - does not get you back everything.

My gut feeling tells me that you are severely oversimplifying things...

So the utility I am thinking of needs to look at a VM, find the corrupted redolog (the _{disk#}-00000#-delta.vmdk file), find that last transaction, and edit the -delta file to make it look like that transaction never happened. After that, the user gets told to manually add a snapshot and delete all snapshots (which should succeed) and the disk is back to a -flat file and all the corruption is gone. The last transaction is lost.

I seriously doubt that it works like a database with identifiable transactions. Sure, the term "transaction" sort of applies, but it is a copy-on-write mechanism. In other words, if you change the same file twice, the changes are not added twice; the first version of the file is overwritten by the second. It also means that "transactions" as you call them (I would call them block writes) are not simply appended to the end of the redo log (really a snapshot file).

As a result you don't know what was written as the last "transaction" and there's no way to find out.
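To illustrate what I mean, here is a toy model in Python. It is only a sketch of the copy-on-write idea as I understand it, not the actual delta.vmdk format (which I have not reverse engineered); the class name `DeltaDisk` and the block layout are made up for the example.

```python
class DeltaDisk:
    """Toy copy-on-write overlay (hypothetical, NOT the real VMDK layout)."""

    def __init__(self, base_blocks):
        self.base = list(base_blocks)  # parent disk, never modified
        self.delta = {}                # block number -> latest contents only

    def write(self, block_no, data):
        # A rewrite replaces the earlier entry; no history is kept.
        self.delta[block_no] = data

    def read(self, block_no):
        # Changed blocks come from the delta, everything else from the base.
        return self.delta.get(block_no, self.base[block_no])


disk = DeltaDisk([b"A", b"B", b"C", b"D"])
disk.write(2, b"first")
disk.write(0, b"other")
disk.write(2, b"second")  # overwrites "first" in place

print(sorted(disk.delta.items()))  # [(0, b'other'), (2, b'second')]
```

After three writes the delta holds only two blocks, and nothing in it says which write came last. In this model there is simply no "last transaction" to find and undo.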

Now, I'm not saying that it is impossible to recover the delta file in a way that lets most of the data be committed. There might be ways, but I'm pretty sure it works quite differently from how you imagine it in your write-up here.

I'm pretty sure that I'm still oversimplifying things here too, but it is how I believe snapshots work.

If someone in the know (Ulli?) thinks I'm way off base, then I'd like to hear that too.



--
Wil
_____________________________________________________
VI-Toolkit & scripts wiki at http://www.vi-toolkit.com

Contributing author at blog www.planetvm.net

Twitter: @wilva

| Author of Vimalin. The virtual machine Backup app for VMware Fusion, VMware Workstation and Player |
| More info at vimalin.com | Twitter @wilva