VMware Cloud Community
WildDoktor
Contributor

How to recover a datastore after a disk failure?

I have an Adaptec RAID 71605Q and 8 HDDs: 4x 2TB in a RAID 5 array, and 4x 3TB in a RAID 10 array. (I use two connectors on the card, each with a 4-into-1 breakout cable; one cable per array, if that makes sense.)
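
As a rough aid for reading the device sizes further down, the nominal usable capacity of the two arrays can be worked out as follows. This is only a sketch: the function name is made up for illustration, and it ignores controller metadata overhead and the TB/TiB difference.

```python
def usable_tb(level: str, drives: int, size_tb: float) -> float:
    """Nominal usable capacity, ignoring controller overhead (illustrative helper)."""
    if level == "raid5":
        return (drives - 1) * size_tb    # one drive's worth of space goes to parity
    if level == "raid10":
        return (drives // 2) * size_tb   # every block is stored on two drives
    raise ValueError(f"unsupported level: {level}")

raid5 = usable_tb("raid5", 4, 2.0)    # 4x 2TB in RAID 5
raid10 = usable_tb("raid10", 4, 3.0)  # 4x 3TB in RAID 10
print(raid5, raid10)
```

Both arrays come out at a nominal 6 TB, so the two logical drives cannot be told apart by size alone.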

I'm running a standalone ESXi 6.0.0 Update 2 (Build 3620759) host (ESXi boots from a USB stick).

I have (had!) 2 datastores; one per array.

About a year ago, 2 drives in the RAID 10 array failed. I've been doing this for 29 years and have never had 2 drives in an array fail. I know others have, but I never have. Guess it's my turn!

Just for grins, I bought 2 more 3TB drives and plugged them in. It wasn't readily apparent how to recover, so I literally set it aside until the other day.

So now, I have 3 of the 4 original drives plus a new drive plugged in, and have an online / degraded array.
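
Whether a 4-drive RAID 10 set survives two failures depends on which two drives fail. The sketch below enumerates the cases under an assumed layout of two mirror pairs, (0, 1) and (2, 3); the real pairing on this controller is not known from the post, so the drive numbers are purely hypothetical.

```python
from itertools import combinations

# Hypothetical layout: drives 0+1 form one mirror, drives 2+3 the other.
MIRROR_PAIRS = [(0, 1), (2, 3)]

def survives(failed: set) -> bool:
    """The set stays readable as long as no mirror pair has lost both members."""
    return all(not (a in failed and b in failed) for a, b in MIRROR_PAIRS)

outcomes = {pair: survives(set(pair)) for pair in combinations(range(4), 2)}
survivable = sum(outcomes.values())
print(survivable, "of", len(outcomes), "two-drive failures leave the data readable")
```

Under that assumption, only the two cases where both members of the same mirror fail lose data.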

In the ESXi web client, I don't see that datastore (I can still see and use the RAID 5 datastore). I'd like to recover it if I can; I've spent a few hours combing the interwebs with no luck, so maybe someone here can help!

Here are some commands I've run via PuTTY, and the results:

esxcli storage vmfs snapshot list:

   Volume Name:

   VMFS UUID:

   Can mount: false

   Reason for un-mountability: some extents missing

   Can resignature: false

   Reason for non-resignaturability: some extents missing

   Unresolved Extent Count: 1

(It also shows my other datastore with no issues.)
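
For anyone who wants to check this kind of output in a script, here is a minimal sketch that turns the "Key: value" lines above into a dictionary and flags a volume that can neither be mounted nor resignatured. The sample text is copied from the output above; the function name is just an illustration.

```python
SAMPLE = """\
   Volume Name:
   VMFS UUID:
   Can mount: false
   Reason for un-mountability: some extents missing
   Can resignature: false
   Reason for non-resignaturability: some extents missing
   Unresolved Extent Count: 1
"""

def parse_fields(text: str) -> dict:
    """Split each 'Key: value' line at the first colon; empty values stay empty."""
    fields = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields

volume = parse_fields(SAMPLE)
blocked = volume["Can mount"] == "false" and volume["Can resignature"] == "false"
print(volume["Reason for un-mountability"], blocked)
```

Note that both the volume name and the VMFS UUID are empty in the pasted output, so there is nothing to pass to a mount or resignature command by label or UUID.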

esxcli storage core device list:

eui.2e73998e00d00000

   Display Name: Local ASR7160 Disk (eui.2e73998e00d00000)

   Has Settable Display Name: true

   Size: 5611518

   Device Type: Direct-Access

   Multipath Plugin: NMP

   Devfs Path: /vmfs/devices/disks/eui.2e73998e00d00000

   Vendor: ASR7160

   Model: datastore1

   Revision: V1.0

   SCSI Level: 2

   Is Pseudo: false

   Status: on

   Is RDM Capable: false

   Is Local: true

   Is Removable: false

   Is SSD: false

   Is VVOL PE: false

   Is Offline: false

   Is Perennially Reserved: false

   Queue Full Sample Size: 0

   Queue Full Threshold: 0

   Thin Provisioning Status: unknown

   Attached Filters:

   VAAI Status: unsupported

   Other UIDs: vml.01000000003865393937333265646174617374

   Is Shared Clusterwide: false

   Is Local SAS Device: false

   Is SAS: false

   Is USB: false

   Is Boot USB Device: false

   Is Boot Device: false

   Device Max Queue Depth: 256

   No of outstanding IOs with competing worlds: 32

   Drive Type: unknown

   RAID Level: unknown

   Number of Physical Drives: unknown

   Protection Enabled: false

   PI Activated: false

   PI Type: 0

   PI Protection Mask: NO PROTECTION

   Supported Guard Types: NO GUARD SUPPORT

   DIX Enabled: false

   DIX Guard Type: NO GUARD SUPPORT

   Emulated DIX/DIF Enabled: false

esxcfg-volume -l

UnresolvedVmfsVolume: Unable to find device in unresolved list:0#eui.2e73998e00d00000:1

VMFS UUID/label: n.a./n.a.

Can mount: No (some extents missing)

Can resignature: No (some extents missing)

Extent name: eui.2e73998e00d00000:1     range: 962072674048 - 962072674303 (MB)

esxcli storage vmfs snapshot extent list

Volume Name  VMFS UUID  Extent Number  Device Name           Partition         Start           End

-----------  ---------  -------------  --------------------  ---------  ------------  ------------

                                    0  eui.2e73998e00d00000          1  962072674048  962072674303
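
The extent range reported above can be compared with the device size from `esxcli storage core device list`. A small sketch, assuming the "Size: 5611518" field is in MiB (5611518 x 1048576 equals the 5884103098368 bytes shown for partition 0 in the partition list) and that the range really is in MB as labelled:

```python
DEVICE_SIZE_MB = 5611518          # "Size" from esxcli storage core device list
EXTENT_START_MB = 962072674048    # range from esxcfg-volume -l
EXTENT_END_MB = 962072674303

extent_length_mb = EXTENT_END_MB - EXTENT_START_MB
fits_on_device = EXTENT_END_MB <= DEVICE_SIZE_MB
ratio = EXTENT_START_MB / DEVICE_SIZE_MB

print("extent length (MB):", extent_length_mb)
print("fits on device:", fits_on_device)
print("start offset / device size:", round(ratio))
```

The range starts far beyond the end of the device, so it does not look like a real location on this disk. What that says about the state of the on-disk metadata is not something this output alone can settle.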

partedUtil getptbl /vmfs/devices/disks/eui.2e73998e00d00000

gpt

715368 255 63 11492388864

1 128 11492388824 AA31E02A400F11DB9590000C2911D1B8 vmfs 0

partedUtil getptbl /vmfs/devices/disks/eui.2e73998e00d00000:1

unknown

715368 255 63 11492388697
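
The two `partedUtil getptbl` results can be cross-checked with a little arithmetic. The sketch below parses the whole-disk output and assumes 512-byte sectors and an inclusive end sector; the field names are my own labels, not partedUtil's.

```python
SECTOR_BYTES = 512  # assumption: 512-byte logical sectors

WHOLE_DISK = """\
gpt
715368 255 63 11492388864
1 128 11492388824 AA31E02A400F11DB9590000C2911D1B8 vmfs 0
"""

def parse_getptbl(text):
    """Return (label, total_sectors, partitions) from partedUtil getptbl output."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    label = lines[0][0]
    _cylinders, _heads, _sectors_per_track, total_sectors = map(int, lines[1])
    partitions = []
    for num, start, end, type_guid, type_name, _attr in lines[2:]:
        partitions.append({
            "number": int(num),
            "start": int(start),
            "end": int(end),
            "guid": type_guid,
            "type": type_name,
        })
    return label, total_sectors, partitions

label, total_sectors, partitions = parse_getptbl(WHOLE_DISK)
part = partitions[0]
part_sectors = part["end"] - part["start"] + 1   # end sector treated as inclusive
disk_bytes = total_sectors * SECTOR_BYTES
part_bytes = part_sectors * SECTOR_BYTES
print(label, part["type"], part_sectors, disk_bytes, part_bytes)
```

The partition length works out to 11492388697 sectors, which is exactly the sector count the second command reports for `eui.2e73998e00d00000:1`. So the GPT and the VMFS partition entry look intact, and "unknown" there only means no partition table was found inside the partition itself.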

esxcli storage core device partition list

Device                Partition  Start Sector   End Sector  Type           Size

--------------------  ---------  ------------  -----------  ----  -------------

mpx.vmhba32:C0:T0:L0          0             0     15826944     0     8103395328

mpx.vmhba32:C0:T0:L0          1            64         8192     0        4161536

mpx.vmhba32:C0:T0:L0          5          8224       520192     6      262127616

mpx.vmhba32:C0:T0:L0          6        520224      1032192     6      262127616

mpx.vmhba32:C0:T0:L0          7       1032224      1257472    fc      115326976

mpx.vmhba32:C0:T0:L0          8       1257504      1843200     6      299876352

mpx.vmhba32:C0:T0:L0          9       1843200      7086080    fc     2684354560

eui.2e73998e00d00000          0             0  11492388864     0  5884103098368

eui.2e73998e00d00000          1           128  11492388825    fb  5884103012864

eui.ae00911700d00000          0             0  11702087680     0  5991468892160

eui.ae00911700d00000          1           128  11702087641    fb  5991468806656
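
The sizes in this listing can be checked against the sector numbers. A minimal sketch, again assuming 512-byte sectors; the rows are copied from the two partition-1 lines above:

```python
SECTOR_BYTES = 512  # assumption: 512-byte logical sectors

# (device, partition, start sector, end sector, size in bytes) from the listing
ROWS = [
    ("eui.2e73998e00d00000", 1, 128, 11492388825, 5884103012864),
    ("eui.ae00911700d00000", 1, 128, 11702087641, 5991468806656),
]

def size_from_sectors(start: int, end: int) -> int:
    # The listed sizes only match if "End Sector" is read as exclusive,
    # i.e. one past the last sector of the partition.
    return (end - start) * SECTOR_BYTES

checks = {dev: size_from_sectors(start, end) == size
          for dev, _part, start, end, size in ROWS}
tib = {dev: size / 2**40 for dev, _part, _s, _e, size in ROWS}

for dev in checks:
    print(dev, "consistent:", checks[dev], "TiB:", round(tib[dev], 2))
```

Both rows are self-consistent, which would explain why the end sector here is one higher than the 11492388824 that `partedUtil getptbl` prints for the same partition.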

df -h

Filesystem   Size   Used Available Use% Mounted on

VMFS-5       5.4T   4.2T      1.3T  77% /vmfs/volumes/Datastore2-3Gbps

vfat       285.8M 202.6M     83.2M  71% /vmfs/volumes/5898d802-31df4d42-c887-001b214c3280

vfat       249.7M 168.7M     81.0M  68% /vmfs/volumes/80d144e6-f6071fe9-239e-81751ede2f2f

vfat       249.7M 168.7M     81.0M  68% /vmfs/volumes/cb65cbe6-73ef9335-efb9-ecd97551455f

4 Replies
continuum
Immortal

> Can mount: No (some extents missing)

Did you ever expand that datastore by adding extents? That is a really bad idea.

Anyway - read my instructions

Create a VMFS-Header-dump using an ESXi-Host in production | VM-Sickbay

Create dumps like that and call me next week

Ulli


________________________________________________
Do you need support with a VMFS recovery problem ? - send a message via skype "sanbarrow"
I do not support Workstation 16 at this time ...

WildDoktor
Contributor

I never intentionally added an extent to that datastore. In my RAID card BIOS (before starting ESXi), I created two virtual drives (the RAID 5 drive and the RAID 10 drive). In ESXi, I created two datastores; one on each virtual drive.

I've never had someone on these forums ask me to call them. Are you charging for your services?

continuum
Immortal

> I've never had someone on these forums ask me to call them.

Yes - as far as I know I am the only one that does that.
But I am also the only one that will look into VMFS-corruption issues at all.

But if you search this forum for a few minutes, you will find that questions like yours in most cases cannot be solved with a quick exchange of questions and answers.

Usually I ask for a dump of the metadata of a VMFS volume, then spend about 2-3 hours analysing it, and eventually come up with some scripts to extract your lost data.

In about 50% of the cases I also need a remote session to do the critical steps myself.

> Are you charging for your services?

I am an idiot and only ask for donations - that's why I had to add the second line to my signature.
And by the way - you will not find a single case where a forum user that I invited to call me had any reasons to complain later ...

Hope that answers your question.

Ulli


tjsampair
Contributor

continuum, I have a spanned datastore volume that comprises 4 extents. I suffered a disk array failure that forced me to re-import the disk volume on a very old PERC 5/I disk controller. I was able to recover the original disk array, but it was assigned a new NAA ID, so the VMFS-5 datastore no longer recognizes it. I am not sure which tools to use to make the datastore accept the new NAA device in place of the old one for the missing extent.

[root@Pegasus:~] vmkfstools -Ph /vmfs/volumes/datastore1\ \(1\)/

VMFS-5.54 file system spanning 4 partitions.

File system label (if any): datastore1 (1)

Mode: public

Capacity 7.3 TB, 917.9 GB available, file block size 1 MB, max supported file size 62.9 TB

UUID: 51f563b0-25ee3451-895e-00188b440eff

Partitions spanned (on "lvm"):

        naa.600188b0436047001987f1834fc6754b:3

        naa.600188b0436047001987f1dce0ed5c8c:1

        (device naa.600188b0436047001987f219956c8e6e:1 might be offline)

        naa.600188b04360470019997bb04b041de4:1

        (One or more partitions spanned by this volume may be offline)

Is Native Snapshot Capable: YES
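
To pick the missing extent out of output like this in a script, something along these lines might do. It is a sketch that only reads the text above (the regular expression and function name are my own) and it does not change anything on the datastore.

```python
import re

SAMPLE = """\
Partitions spanned (on "lvm"):
        naa.600188b0436047001987f1834fc6754b:3
        naa.600188b0436047001987f1dce0ed5c8c:1
        (device naa.600188b0436047001987f219956c8e6e:1 might be offline)
        naa.600188b04360470019997bb04b041de4:1
        (One or more partitions spanned by this volume may be offline)
"""

# Matches "naa.<hex id>:<partition number>" anywhere in a line.
EXTENT = re.compile(r"(naa\.[0-9a-f]+):(\d+)")

def list_extents(text):
    """Collect every spanned partition and mark the ones flagged as offline."""
    extents = []
    for line in text.splitlines():
        match = EXTENT.search(line)
        if not match:
            continue
        extents.append({
            "device": match.group(1),
            "partition": int(match.group(2)),
            "offline": "might be offline" in line,
        })
    return extents

extents = list_extents(SAMPLE)
offline = [e["device"] for e in extents if e["offline"]]
print(len(extents), offline)
```

This only identifies which of the 4 extents the host cannot find; it says nothing about how to get the volume to accept a device under a different NAA ID.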

The extent on "naa.600188b0436047001987f219956c8e6e:1" belongs to the old disk array; after the re-import, the same array was assigned "naa.600188b043604700256a4374a7dd9376:1".

What are the steps to update the datastore to replace the old NAA disk with the new NAA disk?
