Solved: Datastore missing after raid array rebuild

MyCroWave · 02-29-2016

One of the drives in my raid array failed a few days ago.

I had a hot spare in the array so the controller immediately began the rebuild process and all servers remained running throughout.

When the rebuild completed (I verified this via the RAID array user interface and the log), I needed to shut the server down to remove the failed drive (the chassis is not hot-swappable). When I rebooted the server, the datastore on the RAID array was no longer there. I also verified through the RAID controller interface that I had removed the proper (failed) drive and that the array was still in a Ready state when it came up.

In the vSphere Client, when I click the Add Storage... link, the server sees the hardware, but if I click Next it tells me that it will reformat the volume. See attached. I most definitely did not take that next step and reformat. I simply took the screenshot and backed out.

I found these instructions, but they are for a much older version of ESXi and I am not sure whether they are correct for ESXi 6.0.0 338124:

VMware KB: Datastore missing after rebuilding the RAID disk/LUN

Are these the steps that I should follow?

If these are not the right instructions, can you point me to the version for ESXi 6.0.0 338124? I have been unable to locate anything.

Thanks

MyCroWave · 02-29-2016

Sorry all, I just noticed the instructions for ESXi 6.0 in an embedded link in the document that I linked to. Doh!

If that proves to be the fix, I will mark this as answered.

MyCroWave · 02-29-2016

On this page:

https://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&externalId=1011387

It says the following, but when I get to step 6, the VMFS Label column is blank. I didn't go any further at that point because I am pretty new to VMware at this level and would rather not lose the data if I can avoid it. Yes, I have a backup in the cloud, but I would need to order a restore hard drive shipped to me. If I can recover without having to resort to that, it would be much preferred. Any additional guidance would be greatly appreciated.

ESXi 5.x and 6.x

vSphere Client

1. Log in to the vSphere Client and select the server from the inventory panel.
2. In the Hardware panel of Configuration tab, click Storage.
3. Click Add Storage.
4. Select the Disk/LUN storage type.
5. Click Next.
6. From the list of LUNs, select the LUN that has a datastore name displayed in the VMFS Label column.

   Note: The name present in the VMFS Label column indicates that the LUN is a copy that contains a copy of an existing VMFS datastore.
7. Click Next.
8. Under Mount Options, these options are displayed:
   - Keep Existing Signature: Persistently mount the LUN (for example, mount LUN across reboots)
   - Assign a New Signature: Resignature the LUN
   - Format the disk: Reformat the LUN

   Notes:
   - Format the disk option deletes any existing data on the LUN.
   - Before attempting to resignature, ensure that there are no virtual machines running off that VMFS volume on any other host, as those virtual machines become invalid in the vCenter Server inventory and they are to be registered again on their respective hosts.
9. Select the desired option for your volume.
10. In the Ready to Complete page, review the datastore configuration information.
11. Click Finish.

vSphere Web Client

1. Log in to the vSphere Web Client and navigate to vCenter Home.
2. Click Datastores in the menu on the left.
3. In the Objects tab click the Create a new datastore icon in the top left.
4. Type the datastore name and if required, select the placement location for the datastore.
5. Select VMFS as the datastore type.
6. From the list of storage devices, select the device that has a specific value displayed in the Snapshot Volume column.

   Note: The value present in the Snapshot Volume column indicates that the device is a copy that contains a copy of an existing VMFS datastore.
7. Under Mount Options, select the desired option (Keep existing, Assign new, Format) for your volume and click Next.
8. Review the datastore configuration information.
9. Click Finish.

Command line

The esxcli command is used on the command line.

To list the volumes detected as snapshots, run this command:

# esxcli storage vmfs snapshot list

You see output similar to:

49d22e2e-996a0dea-b555-001f2960aed8
   Volume Name: VMFS_1
   VMFS UUID: 49d22e2e-996a0dea-b555-001f2960aed8
   Can mount: true
   Reason for un-mountability:
   Can resignature: true
   Reason for non-resignaturability:
   Unresolved Extent Count: 1

To mount a snapshot/replica LUN that is persistent across reboots, run this command:

# esxcli storage vmfs snapshot mount -l label|-u uuid

For example:

# esxcli storage vmfs snapshot mount -l "VMFS_1"
# esxcli storage vmfs snapshot mount -u "49d22e2e-996a0dea-b555-001f2960aed8"

To mount a snapshot/replica LUN that is not persistent across reboots, run this command:

# esxcli storage vmfs snapshot mount -n -l label|-u uuid

For example:

# esxcli storage vmfs snapshot mount -n -l "VMFS_1"
# esxcli storage vmfs snapshot mount -n -u "49d22e2e-996a0dea-b555-001f2960aed8"

To resignature a snapshot/replica LUN (the volume is mounted immediately after the resignature), run this command:

# esxcli storage vmfs snapshot resignature -l label|-u uuid

For example:

# esxcli storage vmfs snapshot resignature -l "VMFS_1"
# esxcli storage vmfs snapshot resignature -u "49d22e2e-996a0dea-b555-001f2960aed8"

To mount the volume without performing a resignaturing of that volume (this volume is mounted when the ESX host is rebooted), run this command:

# esxcfg-volume -M VMFS_UUID|label

For example:
# esxcfg-volume -M "VMFS_1"
# esxcfg-volume -M "49d22e2e-996a0dea-b555-001f2960aed8"

Note:

To view the datastores again in vCenter Server, you may have to perform a rescan of the storage adapters on all ESXi/ESX hosts that the datastore is presented to or a refresh of the storage view. If you are having trouble identifying the affected datastore, in the vSphere client, check the storage view of another ESX/ESXi host that still has the datastore mounted correctly. This will then allow you to correlate VMFS datastore name with NAA LUN identifier.
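
Since the VMFS Label column was blank in the client, it may help to check from the command line whether the host reports the volume as a mountable snapshot at all. The following is a minimal sketch and not part of the KB article: `list_mountable` is a hypothetical helper name, and it assumes that `esxcli storage vmfs snapshot list` prints each field on its own line in `Key: value` form.

```shell
#!/bin/sh
# list_mountable (hypothetical helper): reads the text printed by
#   esxcli storage vmfs snapshot list
# on stdin and prints "UUID<TAB>label" for every volume whose
# "Can mount" field is true. Assumes one "Key: value" field per line.
list_mountable() {
    awk '
        /^[ \t]*Volume Name:/ { sub(/^[^:]*:[ \t]*/, ""); name = $0; next }
        /^[ \t]*VMFS UUID:/   { sub(/^[^:]*:[ \t]*/, ""); uuid = $0; next }
        /^[ \t]*Can mount:/   {
            sub(/^[^:]*:[ \t]*/, "")
            if ($0 == "true") print uuid "\t" name
        }
    '
}

# Usage on the ESXi host (not run here):
#   esxcli storage vmfs snapshot list | list_mountable
```

An empty result would suggest that the host does not see the volume as a snapshot copy, in which case the mount and resignature commands quoted above would have nothing to act on.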

MyCroWave · 02-29-2016

Here are the errors that I get when I try to start the server that is stored on the RAID array:

Power On virtual machine:6 (No such device or address)

See the error stack for details on the cause of this problem.

Time: 2/29/2016 11:30:43 PM

Target: FS

ESXi: 192.168.1.250

Error Stack

Failed to start the virtual machine.

Cannot open the disk '/vmfs/volumes/52b4931f-de0c19aa-d2b4-001e673dab32/FS/FS_2.vmdk' or one of the snapshot disks it depends on.

6 (No such device or address)

Module Disk power on failed.

Cannot open the disk '/vmfs/volumes/52b4931f-de0c19aa-d2b4-001e673dab32/FS/FS_1.vmdk' or one of the snapshot disks it depends on.

6 (No such device or address)

Thanks
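
The disk paths in the error stack embed the UUID of the datastore that the VM expects to find. A small sketch for pulling that UUID out of a path, so it can be compared with what the host currently lists under `/vmfs/volumes`; `datastore_uuid` is a hypothetical helper name, and the pattern assumes the 8-8-4-12 hex layout seen in the paths above.

```shell
#!/bin/sh
# datastore_uuid (hypothetical helper): prints the datastore UUID embedded in
# a /vmfs/volumes/<uuid>/... path, or nothing if the path has another shape.
datastore_uuid() {
    printf '%s\n' "$1" |
        sed -n 's|^/vmfs/volumes/\([0-9a-f]\{8\}-[0-9a-f]\{8\}-[0-9a-f]\{4\}-[0-9a-f]\{12\}\)/.*$|\1|p'
}

# Example with a path taken from the error stack:
datastore_uuid '/vmfs/volumes/52b4931f-de0c19aa-d2b4-001e673dab32/FS/FS_2.vmdk'
```

If the printed UUID does not appear in `ls /vmfs/volumes` on the host, the datastore is not mounted, which would be consistent with the "No such device or address" error.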

continuum · 03-01-2016

The datastore still seems to have a VMFS partition, so the RAID rebuild probably did not fail completely.
Often the vmdks can still be extracted with the help of a Linux system using vmfs-fuse.
If you want, I can have a closer look.

MyCroWave · 03-01-2016

Hi continuum,

Thanks for the reply. I am encouraged by your optimism. The raid array shows that it was able to successfully rebuild the array. See attached screen-shots.

I mainly work on mid-range computer systems from IBM, where the RAID arrays are extremely reliable. When they say that they rebuilt something, there is never a doubt, and there has never been an issue where the controller was not able to reliably rebuild the array. Given that kind of expectation, if this array was not properly rebuilt, it sort of makes me wonder how the company that makes the card can get away with claiming that their card provides a RAID solution. But let me set my expectations aside and figure out how I can get this darn thing working again.

This is my personal VMware installation, at home, and I sometimes get to work on it in the evenings, after I get home from work, but I mostly work on it on the weekends when I have more time and my head is clear.

Would you be willing to assist me at one of those times?

If that is not possible, I will try to arrange for something during business hours, as it would be great if I can get this going again.

I do have quite a bit of technical skill (30 yrs in software development) but have very little technical expertise with the bowels of VMware. Knowing that, if you think it won't waste too much of your time typing instructions through the thread, I would be open to trying that as well. In addition, perhaps the community would benefit from the solution at the same time.

Thanks again for your offer to help and I look forward to hearing from you.

ThompsG · 03-05-2016

Hi MyCroWave,

Don't want to get in the way of continuum, as he is definitely the expert here, but thought I'd ask a validation question.

You mention that you tried to power the VM on - does this mean part of the VM was on other disks or that you have managed to mount the datastore now but the VM doesn't start?

Kind regards.

MyCroWave · 03-05-2016

Hi ThompsG,

Yes, there were two datastores for the VM that were on the RAID array. The VM itself was stored on a different datastore that was not in the RAID array.

I spent about 48 hours over the past week, including this morning, trying to coax ESXi into recognizing the volumes, with no luck. Finally, I gave up and I removed the hard drives from the virtual machine that were on the corrupt data store. Then, the virtual machine came up without issue.

Finally, since I have everything on those 2 volumes backed up to a cloud provider, I re-created the two data stores in the raid array and began the restore process. It is currently running and has an estimated 16 days left to go.

ThompsG · 03-05-2016

Hi MyCroWave,

Feel your pain - at least you have a recovery point but ouch with the restore time 🙂

To be honest, it is quite scary at the moment the number of people losing data with ESXi. Perhaps it is just my imagination, but the trend seems to be increasing. Maybe it is just that more people are turning to the community for help?

Have a great day,

ThompsG
