VMware Cloud Community
comfine
Contributor

Copy of flat.vmdk after RAID crash contains an old data state?

Hello community,

I will explain the whole story for better understanding, even if that is not strictly necessary.

Last Saturday our server was no longer reachable. I logged in to the VMware vSphere Client and connected to the virtual machine that runs our server. It showed a bluescreen. First I rebooted the virtual machine. It tried to start, but Explorer did not come up correctly: no taskbar, only a command window (I think it was one I had put into autostart at some point). I waited, but nothing changed, and it did not react to Ctrl+Alt+Del. I rebooted the VM a second time. Now I got an error message saying that the virtual HDD file could not be accessed. This virtual HDD file contains our whole business data, which was mapped to drive E:\ in the VM. The system itself is located on a RAID1 and is still working without a problem.

After the problem accessing the virtual HDD, I restarted the ESXi host. The HP ProLiant RAID controller told me there was a problem with the RAID5 (which contains the corrupted virtual HDD) and gave me the choice between F1 (continue) and F2 (enable and continue, can cause data loss). I tried F1. The complete RAID5 was not available in ESXi. I rebooted again and chose F2 (maybe I am mixing something up here and it was F1 and F2 at the first boot...). Now the RAID5 was listed in the vSphere Client again. I tried again to start the virtual machine, but still got the same error.

Then I tried to download the virtual HDD (vmdk) to a local USB drive on the vSphere Client machine as a backup. After 3.1 GB it stopped with an I/O error. I tried again, same problem. At the next reboot the RAID controller told me that it had found a previously interrupted repair process and asked whether I wanted to continue. I said yes. I continued the boot process and tried to download the VMDK file again, and it stopped at exactly the same point after 3.1 GB. Then I logged in via SSH and did a dd of the corrupted RAID5 to another RAID5 on the same machine, skipping errors. Some hours later it told me that it had copied everything except one block.

After this had finished I rebooted, and in the vSphere Client the name of the corrupted RAID5 appeared as the second disk, while the first disk was completely unavailable, as before. I'm not sure whether I pressed F1 at boot time and this is the reason. Anyway, I am now working with this dd copy on the second RAID5. I could attach the virtual HDD on this RAID5 to the VM and boot again. Unfortunately Windows wanted to initialize the disk. I aborted the initialization and started the newest TestDisk version to find out whether there is a functional partition table anywhere. After a deeper search (finished this morning, after some interruptions of my own making over the last two days...) it found some entries containing the correct partition name, "Daten" (German for "data"). But at the end of the deeper search it told me that it cannot restore this partition because the disk seems too small: the end cylinder of the partition found by TestDisk is larger than the last cylinder of the disk, if the data TestDisk displays is correct.

So I added a partition manually with the start values of the automatically found partition and set the end values to the last values of the disk. I pressed P to list the files of this partition and was very happy to see the content of our E:\. I started to recover some small files for testing, and then I was puzzled by the dates. All data displayed in TestDisk has date stamps from May 2012 or older. Some folders are missing completely. The folder I recovered for testing is called "remoteaccess" and contains some RDP connections and so on. TestDisk shows me only old data, including data that had already been deleted. This old data can be restored and is usable, but I want the current data state.

It seems to me that the flat.vmdk is a kind of snapshot, but I have never created one myself. Also, the virtual HDD was created as static.

My simple question now is: can somebody explain to me where this old data state comes from and where the new one could be located?

As long as I don't understand this, I cannot continue the recovery process without risking the destruction of more of the data that may still be recoverable.

Right now I am backing up the state from May 2012 to an external USB disk. Better than nothing. But when this is finished, I'm not sure how to continue.

And yes, shame on me, we don't have a complete backup of the data on this corrupted RAID5, but the data is extremely important for us. It contains our complete Exchange database with all our mails and all open tasks of our customers, and it also contains customer projects, many hours of C++ code and so on... 😞

Any hint or help would be great!

And please don't tell me that I should open a case at VMware. I will do that if there is no other way, but my experience tells me that if I cannot solve this myself, the VMware specialists will not be able to either. This is the reason why I am trying it myself first! If somebody could guarantee that he would recover my complete lost data, I would pay him another 300 euros and more!

Thanks in advance

Andi

3 Replies
a_p_
Leadership

Welcome to the Community,

Given the importance of the data and the fact that you don't have a backup, I'd strongly recommend calling a data rescue company (e.g. Kroll-Ontrack) immediately and stopping any attempts at self-repair. These could make things even worse.

André

comfine
Contributor

Sorry André, but this is not the answer to my question!

As a Kroll-Ontrack partner I know about their services, and I also know about their prices :-). But anyway, apart from the price, the whole thing is not that easy, even for Ontrack!

However, before I give them a try (if I have enough money to do that... I have to sleep on it for a few nights to decide), I want to understand one single thing: why do I find a data state from May 2012 on my copy of the vmdk?

Let me say one more thing: I do my work with a copy of the original vmdk, and so far without any (!) write operation to this copy. Of course, I would never ever write to the original RAID5 now that I know this system is damaged! In addition, the defective RAID5 has not been in use at all since the dd process that created the copy... and this (please correct me if you think I am wrong) does not make anything worse. It is a safe way to do a recovery.

Andi

a_p_
Leadership

Well, the only reason I can think of why you are back to old data is snapshots (as you mentioned in your initial post). The snapshot information shown in the Snapshot Manager is maintained in the VM's .vmsd file, and if this file is corrupted for any reason, the Snapshot Manager might not show any snapshots although they exist. If you are/were able to access any of the VM's old vmware.log files, check them for <vmname>-00000x.vmdk file names to find out whether snapshots were present.
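
If the log files are large, the check can be scripted. The following is only a minimal sketch, assuming that snapshot disks follow the <vmname>-00000x.vmdk naming mentioned above (six digits, optionally followed by "-delta" for the data file) and that a copy of vmware.log is readable as plain text. The function name `find_snapshot_disks` is made up for illustration and is not part of any VMware tool.

```python
import re

# Assumed naming pattern for snapshot disks: <vmname>-00000x.vmdk,
# optionally with a "-delta" suffix for the data file.
# The name part must not contain whitespace, slashes or quotes,
# so only the file name (not the directory path) is captured.
SNAPSHOT_DISK = re.compile(r"[^\s/\"']+-\d{6}(?:-delta)?\.vmdk")


def find_snapshot_disks(log_text):
    """Return the sorted, de-duplicated snapshot disk names found in log_text."""
    return sorted(set(SNAPSHOT_DISK.findall(log_text)))
```

To use it, read a copy of the log (for example with `open("vmware.log", errors="replace").read()`) and pass the text to `find_snapshot_disks`. An empty result does not prove that no snapshot ever existed, it only means this particular log does not mention one. Work on copies of the logs, not on the damaged datastore.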

André
