VMware Cloud Community
amueller
Contributor

Inaccessible VMs after power outage, volume ID missing

We recently had a power outage. After power was restored and ESXi (6.5) started, a number of VMs now show as inaccessible in vSphere Client:

Inaccessible - vSphere Client.png

When I SSH into the server and list the volumes, I do not see that particular volume anywhere. All of the disks are active as usual.

Volumes - PuTTY.png

I am stuck at this point. I am not sure where to go from here, since I can't even find the volume for the particular VMs listed in the vSphere Client.

16 Replies
lucasbernadsky
Hot Shot

Hi there.

Do you know which datastore these VMs were located on before the outage? Maybe an "esxcli storage core adapter rescan --all" will bring the datastore back.

You need to find the VMX files of the inaccessible VMs, so try searching for them in other datastores. You are looking in the right place under /vmfs/volumes. A grep might help you find the VMs' files, or you could search from the "Browse datastore" view in the UI.
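
A minimal sketch of the search suggested above, assuming an SSH session on the ESXi host and the default datastore mount point /vmfs/volumes. The function name find_vmx and the SEARCH_ROOT variable are made up for illustration; find is the only real command used.

```shell
#!/bin/sh
# Assumption: datastores are mounted under /vmfs/volumes (the ESXi default).
SEARCH_ROOT="${SEARCH_ROOT:-/vmfs/volumes}"

# find_vmx DIR: print the path of every .vmx file below DIR.
# Errors (for example unreadable directories) are hidden so the list stays clean.
find_vmx() {
    find "$1" -name '*.vmx' 2>/dev/null
}

# Only search when the root exists, so the script is harmless on other systems.
if [ -d "$SEARCH_ROOT" ]; then
    find_vmx "$SEARCH_ROOT"
fi
```

If a VM's .vmx file does not appear in this list, its datastore is most likely not mounted at all, which points at the volume rather than at the VM registration.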

amueller
Contributor

It has been a few years since those VMs were set up, but I believe I have an idea which datastore they were on (BDGI3). I did a rescan from vSphere Client (Rescan All HBAs and Rescan VMFS), but there was no change.

I ran a search under /vmfs/volumes for all *.vmx files, using the find command over SSH. I do not see any of the ones that show up as inaccessible there :( It is odd to me that the volume shown in the first image, 554cb8a-2f526e1f-acb0-60eb6920d734, is nowhere to be found.

**EDIT** No drives have gone missing during this outage, so I would expect to see that volume under /vmfs/volumes.

a_p_
Leadership

Please run vmkfstools -V (Please note that it is an upper-case -V) from the host's command line to do a rescan, and then check the vmkernel.log to see whether it logs helpful entries.


André
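
A hedged sketch of the rescan-and-check sequence described above, meant to be run in an SSH session on the ESXi host. The wrapper function rescan_and_inspect is hypothetical. The two esxcli listings are an addition that is not part of the original advice: they show which file systems the host currently sees and whether any VMFS volume is being held back as an unresolved snapshot copy.

```shell
#!/bin/sh
# Sketch only: intended for the ESXi shell, where vmkfstools and esxcli exist.
rescan_and_inspect() {
    if ! command -v vmkfstools >/dev/null 2>&1; then
        echo "vmkfstools not found; this is not an ESXi shell"
        return 1
    fi
    # Rescan for VMFS volumes (upper-case -V).
    vmkfstools -V
    # Show the most recent kernel log entries written by the rescan.
    tail -n 40 /var/log/vmkernel.log
    # Addition: list the file systems the host can currently see.
    esxcli storage filesystem list
    # Addition: list VMFS volumes detected as unresolved snapshots/replicas.
    esxcli storage vmfs snapshot list
}
```

Calling rescan_and_inspect on the host prints the four outputs in order; on any other system it only reports that vmkfstools is missing.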

amueller
Contributor

Here are the only messages I get in the vmkernel.log file after running that command:

vmkernel.log - PuTTY.png

That doesn't seem like a whole lot to go on :(

nachogonzalez
Commander

Hey, hope you are doing fine.

Are you using SAN/FC, NAS or iSCSI?

What is the state of the host's vmhbas?

amueller
Contributor

iSCSI for two of them (Diskstation / ReadyNAS); the rest are internal drives on the server.

I pulled all of the drives (none showing errors) and loaded the DiskInternals recovery tool. Then I mounted them individually in an Insignia USB disk docking station. It looks like one of the drives showed up properly and one did not. I think the file system is somehow corrupted, because it shows up like this:

Disks-Mounted.png

The BDGI naming matches the server. There should actually be a BDGI4 (which I had forgotten about); that is the one showing up in yellow above. All other drives show up properly as BDGI1, BDGI3, and BDGISSD1 when mounted in the docking station.

So - I'm hoping this tool can recover something.  But, are there other options?

amueller
Contributor

Well, I went a different route to try to recover the data, but I am still at a loss :(

I did the following:

  • Loaded Ubuntu on my laptop
  • Installed vmfs-tools
  • Hooked up the USB dock with the drive in it
  • Terminal commands to show the volume:
    • sudo fdisk -l  (to find the disk, which was at /dev/sdb)
    • sudo fdisk -l /dev/sdb

SDB1 Mounted.png

  • Terminal Commands to try to mount the drive:
    • sudo vmfs-fuse /dev/sdb1 /mnt/vmfs

Mount.png

I initially tried just vmfs-fuse, but wasn't 100% sure it was a VMFS version 5 volume, so I also tried the version 6 tool. You can see that neither works :( And when I try, /dev/sdb1 then seems to disappear.

I REALLY need to get the data off here, but it seems like ESXi corrupted something.  Any other ideas?

continuum
Immortal

Hi

Have you tried vmfs-tools for VMFS 5?

Anyway, if the data is important, dump the first 2 GB of /dev/sdb to a file and provide a download.

dd if=/dev/sdb of=amueller.2048 bs=1M count=2048

What did the results from DiskInternals look like?

Ulli


________________________________________________
Do you need support with a VMFS recovery problem ? - send a message via skype "sanbarrow"
I do not support Workstation 16 at this time ...

amueller
Contributor

I tried vmfs-tools for version 5 first (before the image was created), and you can see my last command was using version 5 (or I believe it was, anyway). To get the original set of tools on there I just ran:

$ sudo apt-get install -y vmfs-tools

After I couldn't get that to work, I ran the same command with vmfs6-tools instead, and then ran vmfs6-fuse (instead of vmfs-fuse). So I think I've tried both, but I'm mostly a Windows guy, so I barely know what I am doing with Linux.

The data is definitely important.  Very.

I ran the command you gave. It seems to take a long time (similar to when I mount the drive). During that time the drive's LED on the USB dock turns off (no idea why). Once this happens, fdisk -l no longer shows /dev/sdb.

The command produced:

dd: error reading '/dev/sdb': Input/output error
360+1 records in
360+1 records out
377749504 bytes (378 MB, 370 MiB) copied, 273.291 s, 1.4 MB/s

DiskInternals would sometimes show the drive as BDGI4 (as expected) and sometimes not. When I ran the search, it always came back with an empty Recovered folder.
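
The dd output above can be turned into an approximate location of the read error with a little arithmetic. This is only a sketch: it assumes bs=1M means 1048576 bytes per record (as in the dd command given earlier), that "360+1" means 360 full records plus one partial record, and a 512-byte logical sector size, which the thread does not confirm.

```shell
#!/bin/sh
bs=$((1024 * 1024))        # bs=1M from the dd command
full_records=360           # the "360" in "360+1 records in"
bytes_copied=377749504     # byte count reported by dd

# Bytes read in the final, partial record before the I/O error.
partial=$((bytes_copied - full_records * bs))
echo "partial record: $partial bytes ($((partial / 1024)) KiB)"

# Assumption: 512-byte logical sectors.
echo "reading stopped at byte $bytes_copied (sector $((bytes_copied / 512)))"
```

Under these assumptions the disk stopped responding roughly 360 MiB in, far short of the requested 2 GB, so a plain dd cannot produce the header dump from this drive.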

amueller
Contributor

Also, I do have the file; I'm just not sure what to do with it :)

amueller
Contributor

Back to trying more recovery today - any other thoughts on how I can recover the data on the VMFS that somehow got messed up (corrupted?) after a power outage?

continuum
Immortal

When you created the VMFS header dump from Linux, where did you try to store it?

Do you have a second datastore, one that is healthy?



continuum
Immortal

If you want, call me via Skype so that we can do a TeamViewer session and discuss the options.



amueller
Contributor

It was stored locally, then I moved it over to my Windows machine. I have 4 other healthy datastores, yes.

I would Skype, but my wife pulled me away from my computer to help with work outside. I will likely have some time tomorrow. Right now I'm running DiskInternals VMFS Recovery again. It is odd, but the drive didn't show up at all for a bit; then after about 10 minutes I heard it start spinning, so I re-opened VMFS Recovery. It is working on it now and shows 26 folders and 339 files, so I am hoping that once it completes I will have something to recover.

amueller
Contributor

Well, VMFS Recovery by DiskInternals found all the files and folders. However, I didn't realize it would cost $1599 to license it and recover all my files :(

amueller
Contributor

So my next step was to try CloneZilla and attempt to clone the disk (minus the bad sectors), so that I don't accidentally ruin the original. It is running, but I am not 100% convinced it will work.

I ran it in advanced mode, enabled rescue mode, and set it to NOT fix bad sectors. It ran at a pretty good clip at first, but for the last hour it has just been scrolling through bad sector after bad sector. When this happens, PartClone (part of CloneZilla) seems to write all over the screen, so it is difficult to read. From what I can see, I have about 70 hours remaining :( I guess I will just let it ride out.

My only other thought was to use dd_rescue to run a similar routine.
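
A rough sketch of what such a routine could look like. Note the swap: this uses GNU ddrescue (the gddrescue package on Ubuntu), which is a different tool from dd_rescue, because its map file lets an interrupted copy resume where it stopped. The function name rescue_disk and all paths are hypothetical; the image and map file must live on a separate, healthy disk with enough free space.

```shell
#!/bin/sh
# rescue_disk SRC IMG MAP: copy a failing disk to an image file in two passes.
rescue_disk() {
    src="$1"   # failing disk, e.g. /dev/sdb (assumption)
    img="$2"   # destination image on a healthy disk (hypothetical path)
    map="$3"   # ddrescue map file that records progress and bad areas

    if [ ! -b "$src" ]; then
        echo "source $src is not a block device"
        return 1
    fi

    # Pass 1: grab everything that reads easily, skip the scraping phase (-n).
    ddrescue -n "$src" "$img" "$map"

    # Pass 2: go back to the bad areas with direct disk access (-d)
    # and up to three retry passes (-r3).
    ddrescue -d -r3 "$src" "$img" "$map"
}
```

A call might look like `sudo sh -c '. ./rescue.sh; rescue_disk /dev/sdb /mnt/healthy/bdgi4.img /mnt/healthy/bdgi4.map'`, where rescue.sh and the /mnt/healthy paths are placeholders. All later recovery attempts can then work on the image instead of the original drive.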
