6 Replies Latest reply on Apr 24, 2019 8:28 AM by continuum

    Disk recovery, help!

    zhanglin999 Novice
    VMware Employees

      Hi,

       

      After a power failure, one datastore became disconnected on our ESXi host. It is a RAID5 array consisting of 3 disks. I booted from an Ubuntu live CD and got the following information:

       

      ubuntu@ubuntu:~$ sudo fdisk -l /dev/sde

      Disk /dev/sde: 3.3 TiB, 3599451029504 bytes, 7030177792 sectors

      Units: sectors of 1 * 512 = 512 bytes

      Sector size (logical/physical): 512 bytes / 512 bytes

      I/O size (minimum/optimal): 512 bytes / 512 bytes

      Disklabel type: gpt

      Disk identifier: 543D936A-AE82-4ABE-A2FB-ED296E861BC1

       

       

      Device     Start        End    Sectors  Size Type

      /dev/sde1   2048 7030177758 7030175711  3.3T VMware VMFS

       

      But when I try to mount it with vmfs-fuse, it outputs the following error:

      ubuntu@ubuntu:~$ sudo vmfs-fuse /dev/sde1 /mnt/vmfs

      VMFS VolInfo: invalid magic number 0x00000000

      VMFS: Unable to read volume information

      Trying to find partitions

      Unable to open device/file "/dev/sde1".

      Unable to open filesystem

       

      Is it possible to get my data back? continuum, I see you have helped so many people with similar issues, so I am trying to @ you to see if you can help :-)

        • 1. Re: Disk recovery, help!
          continuum Guru
          Community Warriors, User Moderators, vExpert

          Hi,
          Do you still have the system booted into Linux?
          Then please dump the first 1536 MB (1.5 GB) of sde to a file:
          dd if=/dev/sde bs=1M count=1536 of=zhanglin.1536
          Compress that file and provide a download link.
          The dump will show me if your raid5 is still healthy.
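One basic check that can be run on such a dump is whether the VMFS volume-info magic number sits where it is expected. The following is only a minimal sketch: the offset (1 MiB into the partition) and the magic value (0xC001D00D, little-endian) are assumptions taken from the open-source vmfs-tools code, not something stated in this thread, and the function names are hypothetical.

```python
import struct

# Assumptions (from the open-source vmfs-tools code, not from this thread):
# the volume-info block starts 1 MiB into the VMFS partition and begins with
# the little-endian 32-bit magic 0xC001D00D.
VOLINFO_OFFSET = 0x100000
VOLINFO_MAGIC = 0xC001D00D


def read_volinfo_magic(path, partition_start=0):
    """Return the 32-bit value found where the VMFS magic is expected.

    partition_start is the byte offset of the partition inside the dump
    (0 for a partition dump). Returns None if the dump is too short.
    """
    with open(path, "rb") as f:
        f.seek(partition_start + VOLINFO_OFFSET)
        raw = f.read(4)
    if len(raw) < 4:
        return None
    return struct.unpack("<I", raw)[0]


def has_vmfs_magic(path, partition_start=0):
    """True if the expected magic is present at the expected offset."""
    return read_volinfo_magic(path, partition_start) == VOLINFO_MAGIC
```

For a dump of the partition (/dev/sde1), `partition_start` stays 0. For a dump of the whole disk (/dev/sde), it would be 2048 * 512, because fdisk reports the partition starting at sector 2048 with 512-byte sectors. A returned value of 0 would correspond to the "invalid magic number 0x00000000" message from vmfs-fuse.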

          • 2. Re: Disk recovery, help!
            zhanglin999 Novice
            VMware Employees

            Here is the OneDrive link:

            https://onevmw-my.sharepoint.com/:u:/g/personal/zzhou_vmware_com/ERNRFZsbmCxOmOyUtX9BhIoBcvbWXDjrKYOPSKmClFv72w?e=NuYOMF

             

            I dumped /dev/sde1 into this file. If /dev/sde is required, I will dump it and upload it again.

             

            Many thanks!

            • 3. Re: Disk recovery, help!
              continuum Guru
              User Moderators, vExpert, Community Warriors

              To anybody reading this post ....
              This is a typical example of why I tell my customers that the combination:
              - local storage +
              - Raid5 +
              - VMFS +
              - unreliable power supply
              ------------------------------------
              = unacceptable risk.
              @ zhanglin
              In the current state the VMFS is garbage.
              The VMFS magic number is not present at the expected offset.
              The pointers to the hidden .sf files are not at the expected offset.
              Apparently the raid-controller could not handle the power failure correctly.
              The only solution that I can suggest is to use the disks without the Raid-controller.
              Then you either use a commercial tool to build a virtual Raid-array or use Linux to set up a software Raid.
              If you are lucky - and if you can query the existing Raid-controller for the necessary parameters like stripe-size and so on - it may be possible to re-assemble the array.
              At first sight the data seems to be still there: the vmdk-descriptorfiles for example are present and can be extracted as expected.
              VMDK descriptor files appear to be smaller than the stripe size in use, so they come in one piece.
              VMX-files - larger than vmdk-descriptors - are already incomplete, which is a sign that the Raid no longer has the correct structure.
              The best you can get from this array in its current state is files with a size of a few KB.
              Everything larger than that is unusable.
              So the next thing to do - if the data is important - is to use the raid-disks one by one and try to build a virtual raid which hopefully is healthy enough to create dd-scripts for the vmdk-files.
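To illustrate what building a "virtual raid" from the individual disks involves, here is a minimal sketch that reads logical data back from RAID5 member images. It assumes the left-symmetric parity rotation (the Linux md default); the actual controller may use a different layout, disk order, or start offset, so the layout, `chunk_size`, and both function names here are assumptions for illustration only.

```python
def raid5_read_logical(disks, chunk_size, num_chunks):
    """Rebuild the first num_chunks data chunks from RAID5 member images.

    disks: list of bytes-like objects, one per member, in slot order.
    Assumes the left-symmetric layout; a real controller may differ.
    """
    n = len(disks)
    out = bytearray()
    for logical in range(num_chunks):
        # Each stripe row holds n - 1 data chunks plus one parity chunk.
        row, idx = divmod(logical, n - 1)
        # Parity starts on the last disk and moves one disk left per row.
        parity_disk = (n - 1) - (row % n)
        # Data chunks follow the parity disk and wrap around.
        data_disk = (parity_disk + 1 + idx) % n
        start = row * chunk_size
        out += disks[data_disk][start:start + chunk_size]
    return bytes(out)


def xor_chunks(*chunks):
    """XOR equal-length chunks together.

    In RAID5 this recovers one missing chunk of a stripe row from the
    remaining data and parity chunks of that row.
    """
    result = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, b in enumerate(chunk):
            result[i] ^= b
    return bytes(result)
```

With the wrong disk order or stripe size the output is interleaved incorrectly, which would be consistent with small files surviving in one piece while larger ones come out broken.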

              • 4. Re: Disk recovery, help!
                gregsn Enthusiast

                In my experience, VMFS can be used safely as local storage when:

                • The RAID controller MUST have protected write cache (preferably capacitor-based).
                • If the RAID controller does not have protected write caching ability, write caching MUST BE DISABLED.
                • Caching on the disk level MUST be disabled (usually set at the controller level).
                • Use top-tier RAID controller brands (e.g. Adaptec/Microsemi or LSI) that have the ability to properly implement the above.

                 

                I can't vouch for other brands since I've only had first hand experience with Adaptec or LSI controllers and have never suffered VMFS loss after a power outage (or any other unexpected/dirty shutdown) in the above configuration.

                 

                I also recommend (not a hard requirement to protect VMFS in a dirty shutdown scenario) RAID6 or 60 as a minimum RAID configuration (i.e. at least dual-disk failure tolerance for any data stored on the array).

                 

                It would be interesting to know what configuration the OP had that led to this failure, so it can be avoided in the future.

                 

                PS:

                The only times I've lost a VMFS using the above scenario were not during a power outage but due to either a disk sending faulty data back to the controller (e.g. a "lower brand" SSD sending garbage data back to the controller) or data corruption during a RAID 6 rebuild (a bug I personally confirmed on older Adaptec firmware that will corrupt a RAID 6 volume during a rebuild with >2TB disks). With that said, I now only use Intel brand SSDs, which have capacitor-backed protected write caching, and have had zero issues since.

                • 5. Re: Disk recovery, help!
                  zhanglin999 Novice
                  VMware Employees

                  This is really sad news :-(

                  We started trying DiskInternals today, but it has not completed yet. If there is any progress, I will update this thread to let you know.

                  Anyway, I do appreciate your help!

                   

                  --zhanglin

                  • 6. Re: Disk recovery, help!
                    continuum Guru
                    vExpert, Community Warriors, User Moderators

                    I would not expect results with DiskInternals or UFS Explorer in the current state.
                    Both tools can do a good job with a slightly corrupted VMFS-volume - but at the moment the content of the area reserved for the metadata does not qualify as valid VMFS.
                    So you are basically starting a raw scan looking for file signatures.
                    I expect that you may even find some promising pieces, but I doubt that you will find any usable file larger than a few hundred KB.
                    Anyway - please keep us updated.

                    Ulli
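The raw signature scan mentioned above can be sketched roughly as follows. This is a simplified illustration, not how DiskInternals or UFS Explorer work internally: it only looks for the text line `# Disk DescriptorFile`, which opens a VMDK descriptor, and the function name and `block_size` parameter are hypothetical.

```python
# First line of a VMDK descriptor file.
DESCRIPTOR_SIGNATURE = b"# Disk DescriptorFile"


def find_signature_offsets(path, signature=DESCRIPTOR_SIGNATURE, block_size=1 << 20):
    """Yield the byte offset of every occurrence of signature in the file."""
    overlap = len(signature) - 1
    with open(path, "rb") as f:
        tail = b""
        base = 0  # file offset of the first byte of tail + block
        while True:
            block = f.read(block_size)
            if not block:
                break
            buf = tail + block
            pos = buf.find(signature)
            while pos != -1:
                yield base + pos
                pos = buf.find(signature, pos + 1)
            # Keep the last len(signature) - 1 bytes so that a match
            # straddling two blocks is still found on the next pass.
            tail = buf[-overlap:] if overlap else b""
            base += len(buf) - len(tail)
```

Each reported offset is only a candidate starting point. On a broken array, whatever follows the signature is trustworthy only up to the next stripe boundary.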