We recently had a power outage, and after power was restored and ESXi (6.5) started, a number of VMs now show as inaccessible in the vSphere Client:
When I SSH into the server and list the volumes I do not see that particular volume anywhere. All of the disks are active as per usual.
I am stuck at this point. I'm not sure where to go from here, since I can't even find the volume for the particular VMs listed in the vSphere Client.
Hi there.
Do you know which datastore these VMs were located in before the outage? Maybe an "esxcli storage core adapter rescan --all" will bring the datastore back.
You need to find the VMX files of the inaccessible VMs, so try searching for them in the other datastores. The VMFS folder is the right place to look. Maybe a grep would help you find the VM files, or you could search from the "browse datastore" view in the UI.
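For reference, the search suggested above could look roughly like this. It is only a sketch: the function name find_vmx is made up for illustration, and /vmfs/volumes is assumed to be where the host mounts its datastores.

```shell
# Hypothetical helper: list every .vmx file below a datastore root.
# find does not follow symlinks by default, so the datastore-label
# symlinks are skipped and each datastore is walked once via its UUID directory.
find_vmx() {
    root="${1:-/vmfs/volumes}"
    find "$root" -type f -name '*.vmx' 2>/dev/null
}
```

Calling `find_vmx /vmfs/volumes` over SSH prints one path per VMX file. A VM whose VMX file is missing from that list probably lives on a datastore the host does not currently have mounted.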
It has been a few years since those VMs were set up, but I believe I have an idea which datastore they were on (BDGI3). I did a rescan from the vSphere Client (Rescan All HBAs & Rescan VMFS), but there was no change.
I did run a search in the /volumes folder for all *.vmx files, using the find command over SSH. I do not see any of the VMs that show up as inaccessible there. It is odd to me that the volume that shows in the first image, 554cb8a-2f526e1f-acb0-60eb6920d734, is nowhere to be found.
**EDIT** No drives have gone missing during this outage, so I would expect to see that volume under /vmfs/volumes.
Please run vmkfstools -V (Please note that it is an upper-case -V) from the host's command line to do a rescan, and then check the vmkernel.log to see whether it logs helpful entries.
André
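A rough sketch of that check, in case it helps. The helper name recent_storage_lines and the keyword list are my own assumptions, not anything official. /var/log/vmkernel.log is the usual log location on the host.

```shell
# Hypothetical helper: show the most recent storage-related lines from a log.
# The keyword list is a guess at what is relevant, not an official filter.
recent_storage_lines() {
    log="${1:-/var/log/vmkernel.log}"
    count="${2:-40}"
    grep -iE 'vmfs|lvm|scsi|nmp' "$log" | tail -n "$count"
}

# On the host, after the rescan:
#   vmkfstools -V
#   recent_storage_lines /var/log/vmkernel.log
```

Filtering keeps the output short enough to paste into a reply. If nothing relevant shows up, widen the pattern or read the unfiltered tail of the log.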
Here are the only messages I get in the vmkernel.log file after running that command:
That doesn't seem like a whole lot to go on.
Hey, hope you are doing fine.
Are you using SAN/FC, NAS or iSCSI?
How are the vmhbas of the host?
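One way to collect that information in a single go is sketched below. The wrapper name storage_overview is hypothetical, and the selection of listings is just my suggestion. The esxcli namespaces themselves are standard on ESXi.

```shell
# Hypothetical wrapper around a few esxcli listings; run it on the ESXi host.
storage_overview() {
    if ! command -v esxcli >/dev/null 2>&1; then
        echo "esxcli not found; run this on the ESXi host" >&2
        return 1
    fi
    esxcli storage core adapter list     # vmhba adapters and their link state
    esxcli storage core device list      # disks/LUNs the host can see
    esxcli storage filesystem list       # volumes the host knows about
    esxcli storage vmfs snapshot list    # VMFS copies flagged as unresolved/snapshot volumes
}
```

If the disk appears in the device list but its volume is missing from the filesystem list, the problem is more likely on the VMFS side than with the adapter.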
iSCSI for the 2 NAS units (Diskstation / ReadyNAS); the rest are internal drives on the server.
I pulled all of the drives (none showing error) and loaded DiskInternals Recovery tool. Then mounted them individually in an Insignia USB Disk Docking station (whatever it is called lol). Looks like one of the drives showed up properly and one did not. I think somehow the file system is corrupted because it shows like this:
The BDGI matches with the server. There should actually be a BDGI4 (which I had forgotten about). That one is now showing up as the one in yellow above. All other drives show up properly as BDGI1, BDGI3, and BDGISSD1 when mounted in the docking station.
So - I'm hoping this tool can recover something. But, are there other options?
Well, I went a different route to try to recover the data, but I'm still at a loss.
I did the following:
I initially tried just vmfs-fuse, but I wasn't 100% sure it was a VMFS version 5 volume, so I also tried version 6. You can see that neither works. And when I try them, /dev/sdb1 then seems to disappear.
I REALLY need to get the data off here, but it seems like ESXi corrupted something. Any other ideas?
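For anyone following along, the two mount attempts described above amount to roughly the following. This is a sketch, not exactly what I typed: try_vmfs_mount is a made-up name, /dev/sdb1 and the mount point are just the values from my setup, and it needs to run as root.

```shell
# Hypothetical helper: try the VMFS 5 tool first, then the VMFS 6 tool.
# Both tools are called as: <tool> <device> <mountpoint>.
try_vmfs_mount() {
    dev="$1"
    mnt="$2"
    mkdir -p "$mnt" || return 1
    for tool in vmfs-fuse vmfs6-fuse; do
        if "$tool" "$dev" "$mnt"; then
            echo "mounted with $tool"
            return 0
        fi
    done
    echo "neither vmfs-fuse nor vmfs6-fuse could mount $dev" >&2
    return 1
}
```

Usage would be something like `try_vmfs_mount /dev/sdb1 /mnt/vmfs`. If both tools fail and the device then drops off the bus, that points at the disk itself rather than at the VMFS version.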
Hi
Have you tried vmfs-tools for VMFS 5?
Anyway - if the data is important, dump the first 2 GB of /dev/sdb to a file and provide a download.
dd if=/dev/sdb of=amueller.2048 bs=1M count=2048
What did the results from DiskInternals look like?
Ulli
I tried vmfs-tools for 5 first (before the image was created), but you can see my last command was using version 5 (or I believe it was anyway). To get the original set of tools on there I just ran a
$ sudo apt-get install -y vmfs-tools
After I couldn't get that to work, I ran the same command with vmfs6-tools instead. Then I ran vmfs6-fuse (instead of vmfs-fuse). So - I think I've tried both... but I'm mostly a Windows guy, so I barely know what I am doing with Linux.
The data is definitely important. Very.
I ran the command you gave. It seems to take a long time (similar to when I mount the drive). During that time the LED for the drive on the USB dock turns off (no idea why). Once this happens, if I run fdisk -l, it won't show /dev/sdb.
The command produced:
dd: error reading '/dev/sdb': Input/output error
360+1 records in
360+1 records out
377749504 bytes (378 MB, 370 MiB) copied, 273.291 s, 1.4 MB/s
DiskInternals would sometimes show the drive as BDGI4 (as expected) and sometimes would not. When I ran through the search, it always came back with an empty Recovered folder.
Also - I do have the file, I'm just not sure what to do with it.
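One variation I could still try is letting dd continue past the read error instead of stopping. A minimal sketch, assuming GNU dd; the name dump_header is made up, and this only helps if the drive stays attached, which in my case it may not.

```shell
# Hypothetical helper: copy the first N MiB of a source, padding unreadable
# or short blocks with zeros instead of aborting on the first I/O error.
dump_header() {
    src="$1"
    out="$2"
    mib="${3:-2048}"
    dd if="$src" of="$out" bs=1M count="$mib" conv=noerror,sync
}
```

That would be `dump_header /dev/sdb amueller.2048 2048` for the same 2 GB as before. With conv=noerror,sync every failed read is replaced by a whole block of zeros, so a smaller bs would lose less data around each bad spot, at the cost of speed.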
Back to trying more recovery today - any other thoughts on how I can recover the data on the VMFS that somehow got messed up (corrupted?) after a power outage?
When you created the VMFS header dump from Linux, where did you try to store it?
Do you have a second datastore - one that is healthy ?
If you want - call me via Skype so that we can do a TeamViewer session and discuss the options.
It stored locally, then I moved it over to my Windows machine. I have 4 other healthy datastores, yes.
I would Skype, but my wife pulled me from my computer to help with work outside. I will likely have some time tomorrow. Right now I'm running DiskInternals VMFS Recovery again. It is odd, but the drive didn't show up at all for a bit... then after about 10 minutes I heard it start spinning, so I re-opened VMFS Recovery. It is working on it now and shows 26 folders and 339 files. So - I am hoping that once it completes I will have something to recover.
Well, VMFS Recovery by DiskInternals found all the files and folders. However - I didn't realize it would cost $1599 to license it and recover all my files.
So - my next step was to try CloneZilla and attempt to clone the disk (minus the bad sectors). It is running, but I'm not 100% convinced it will work. The reason for cloning is to make sure I don't accidentally ruin the original.
I set it up in advanced mode, enabled rescue mode, and set it to NOT fix bad sectors. It is running now. It ran at a pretty good clip at first, but for the last hour it has just been scrolling through bad sector after bad sector. When this happens, PartClone (part of CloneZilla) seems to write all over the screen, so it is difficult to read. From what I can see, I have about 70 hours remaining. I guess I will just let it ride out.
My only other thought was to use dd_rescue to try to run a similar routine.
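What I have in mind is roughly the sketch below. Note that it uses GNU ddrescue, which is a different tool from dd_rescue despite the similar name. The wrapper name rescue_disk, the DDRESCUE override, and the choice of 3 retry passes are my own assumptions.

```shell
# Hypothetical two-pass wrapper around GNU ddrescue (not dd_rescue).
# DDRESCUE can be pointed at a stub command for a dry run.
rescue_disk() {
    src="$1"
    img="$2"
    map="$3"
    # Pass 1: copy the easily readable areas and skip the slow scraping phase.
    "${DDRESCUE:-ddrescue}" -n "$src" "$img" "$map" || return 1
    # Pass 2: revisit the bad areas with direct disc access and 3 retry passes.
    "${DDRESCUE:-ddrescue}" -d -r3 "$src" "$img" "$map"
}
```

For my drive that would be something like `rescue_disk /dev/sdb bdgi4.img bdgi4.map`, run as root with the image on a healthy disk. The map file records what has already been copied, so the run can be interrupted and resumed if the drive drops off the USB dock again.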