VMware Cloud Community
Charles123987
Contributor

Datastore unavailable after RAID crash/rebuild

I had an ESXi 6.5 server on a Dell R520 with an 8-disk RAID6 array (8 x 3TB, 18TB usable).

As mentioned in several threads here, following a RAID crash and drive rebuild, the only datastore disappeared.

esxcli storage vmfs snapshot list

53fc9638-dab159ce-4eea-c81f66b9a264

   Volume Name: datastore1

   VMFS UUID: 53fc9638-dab159ce-4eea-c81f66b9a264

   Can mount: false

   Reason for un-mountability: some extents missing

   Can resignature: false

   Reason for non-resignaturability: some extents missing

   Unresolved Extent Count: 2

partedUtil getptbl /dev/disks/naa.690b11c00222400024dabd24064e6018

gpt

2188400 255 63 35156656128

1 64 8191 C12A7328F81F11D2BA4B00A0C93EC93B systemPartition 128

5 8224 520191 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0

6 520224 1032191 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0

7 1032224 1257471 9D27538040AD11DBBF97000C2911D1B8 vmkDiagnostic 0

8 1257504 1843199 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0

9 1843200 7086079 9D27538040AD11DBBF97000C2911D1B8 vmkDiagnostic 0

2 7086080 15472639 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0

3 15472640 35156656094 AA31E02A400F11DB9590000C2911D1B8 vmfs 0

esxcfg-volume -l

VMFS UUID/label: 53fc9638-dab159ce-4eea-c81f66b9a264/datastore1

Can mount: No (some extents missing)

Can resignature: No (some extents missing)

Extent name: naa.690b11c00222400024dabd24064e6018:3     range: 0 - 17036287 (MB)

Extent name: naa.690b11c00222400024dabd24064e6018:3     range: 962072674048 - 962072674303 (MB)

Mounting partition #3 via linux/vmfs-tools was successful, and I've been able to copy some of the smaller VMs to a different ESXi server.  However, there is a 17TB VM that I would prefer to run P2V Converter on (since it contains a large filesystem that is mostly empty at this time).  It consists of 7 .vmdk files, 5 x 2TiB and 2 x 3TB, and the VM has all the drives combined via LVM into a single large volume group.

The server had originally been installed with a 12TB RAID6 (under ESXi 5.5), but the RAID had been grown twice (I think), to 15TB and then 18TB, and the datastore had been grown along with it.

I've been unable to figure out how to mount the datastore inside ESXi.  I tried the GUI, but it insists it has to reformat the volume, so I did not proceed.

Any suggestions for getting the data off?  Some way to mount the datastore inside ESXi?

Or perhaps I could loop-mount the .vmdk files (via linux/vmfs-tools), reconstruct the volume group, and mount the oversized ext4 filesystem?
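
Roughly what I have in mind is the following - only a sketch, and the loop-device options, paths and LVM names are illustrative:

# on the Linux recovery host, after vmfs-fuse has exposed the -flat.vmdk files
for f in /vmfs/vnas/vnas*-flat.vmdk; do
    losetup -fP --show "$f"               # attach each flat vmdk as a loop device, scanning for partitions
done
vgscan                                    # let LVM discover the PVs on the loop devices
vgchange -ay                              # activate the reassembled volume group(s)
mount -o ro /dev/vg_vnas/lv_data /mnt     # mount the large ext4 filesystem read-only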

continuum
Immortal

> Reason for un-mountability: some extents missing

If I read your post correctly you expanded the datastore twice - using extents.

Does that mean that you had created a 12 TB array plus two more 3 TB arrays?

Are those two 3 TB volumes still present?
> Mounting partition #3 via linux/vmfs-tools was successful,

That is good, but that will only help for VMDKs that are located completely inside the 12 TB "parent volume".

To read anything else you need those other 2 volumes.

Anyway - I think I can create dd scripts to manually extract the large vmdks - but that depends on the availability of the missing extents.

Please read Create a VMFS-Header-dump using an ESXi-Host in production | VM-Sickbay

I need a dump like that for all of the extents.
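
In essence that dump is just the first part of the VMFS partition read with dd - something along these lines (check the article for the exact count to use; the output location is only an example and must have enough free space):

dd if=/dev/disks/naa.690b11c00222400024dabd24064e6018:3 bs=1M count=1536 | gzip -c > /vmfs/volumes/scratch/vmfs-header-p3.gz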

By the way - learn your lesson here and stop using extents to expand a VMFS-datastore.
5 x 2TiB and 2 x 3TB combined into a 17 TB LVM? - the admin who created that must be a very brave man - or a saboteur ...

Well - I think I can help you - if the 2 missing extents are available.

Ulli


________________________________________________
Do you need support with a VMFS recovery problem ? - send a message via skype "sanbarrow"
I do not support Workstation 16 at this time ...

Charles123987
Contributor

I don't remember exactly how I enlarged the datastore from inside ESXi (it was several years ago), but what I think I did was:

1) Initially, the RAID was formatted as a single 12TB GPT volume, with a small, perhaps 8GB partition for the ESXi software, and the rest as Datastore1

2) When it came time to enlarge the array, I shut down, booted into the PERC H710 BIOS configuration utility, and added the new 3TB drive to the RAID6 array.

3) I let it grow the array in the background for two or three days while continuing to use ESXi and the VM clients on that host.

4) When I saw that the RAID reconstruction had completed, I rebooted and told ESXi through the GUI to use the newly available space for the datastore.

I definitely didn't look at the GPT partitions at the time, so I have no idea if there were additional partitions, or if everything was in GPT partition 3 as it appears now.

gpt

2188400 255 63 35156656128

1 64 8191 C12A7328F81F11D2BA4B00A0C93EC93B systemPartition 128

5 8224 520191 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0

6 520224 1032191 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0

7 1032224 1257471 9D27538040AD11DBBF97000C2911D1B8 vmkDiagnostic 0

8 1257504 1843199 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0

9 1843200 7086079 9D27538040AD11DBBF97000C2911D1B8 vmkDiagnostic 0

2 7086080 15472639 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0

3 15472640 35156656094 AA31E02A400F11DB9590000C2911D1B8 vmfs 0

vmfs-tools is able to mount the partition OK - probably because I originally created the datastore under ESXi 5.5 and only upgraded to 6.5 later - although it gives an error on mount:

# vmfs-fuse -r /dev/sdb3 /vmfs

ioctl: Inappropriate ioctl for device

ioctl: Inappropriate ioctl for device

# df /vmfs

Filesystem            1K-blocks        Used Available Use% Mounted on

/dev/fuse           17570463744 17174385664 396078080  98% /vmfs

Charles123987
Contributor

I can post the VMFS header for that partition for you if you want to examine it.

continuum
Immortal

vmfs-fuse -r /dev/sdb3 /vmfs

is not enough.

You must also specify the missing extents - so you will need something like

vmfs-fuse /dev/sdb3 /dev/sdc1 /dev/sdd1 /vmfs

Please list the available disks / partitions
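
For example, on the ESXi host and on the Linux recovery system respectively:

ls -l /dev/disks/        # ESXi: every detected LUN and its partitions
lsblk                    # Linux recovery host: all block devices and their partitions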



Charles123987
Contributor

I've listed the GPT partition table twice.  There are no apparent missing extents, although the partition numbers go 1, 2, 3, 5-9, skipping #4.

When I mount the single partition /dev/sdb3 using vmfs-fuse it doesn't complain about any missing extents, and I have successfully copied off about 800GB so far from that mount.

I'll list it again:

[root@westeros:~] partedUtil getptbl /dev/disks/naa.690b11c00222400024dabd24064e6018

gpt

2188400 255 63 35156656128

1 64 8191 C12A7328F81F11D2BA4B00A0C93EC93B systemPartition 128

5 8224 520191 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0

6 520224 1032191 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0

7 1032224 1257471 9D27538040AD11DBBF97000C2911D1B8 vmkDiagnostic 0

8 1257504 1843199 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0

9 1843200 7086079 9D27538040AD11DBBF97000C2911D1B8 vmkDiagnostic 0

2 7086080 15472639 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0

3 15472640 35156656094 AA31E02A400F11DB9590000C2911D1B8 vmfs 0

[root@westeros:~] ls -l '/dev/disks/naa.690b11c00222400024dabd24064e6018:3'

-rw-------    1 root     root     17992285928960 Aug  7 12:51 /dev/disks/naa.690b11c00222400024dabd24064e6018:3

[root@westeros:~] ls -lh /dev/disks/

total 35189320592

-rw-------    1 root     root       16.4T Aug  7 14:27 naa.690b11c00222400024dabd24064e6018

-rw-------    1 root     root        4.0M Aug  7 14:27 naa.690b11c00222400024dabd24064e6018:1

-rw-------    1 root     root        4.0G Aug  7 14:27 naa.690b11c00222400024dabd24064e6018:2

-rw-------    1 root     root       16.4T Aug  7 14:27 naa.690b11c00222400024dabd24064e6018:3

-rw-------    1 root     root      250.0M Aug  7 14:27 naa.690b11c00222400024dabd24064e6018:5

-rw-------    1 root     root      250.0M Aug  7 14:27 naa.690b11c00222400024dabd24064e6018:6

-rw-------    1 root     root      110.0M Aug  7 14:27 naa.690b11c00222400024dabd24064e6018:7

-rw-------    1 root     root      286.0M Aug  7 14:27 naa.690b11c00222400024dabd24064e6018:8

-rw-------    1 root     root        2.5G Aug  7 14:27 naa.690b11c00222400024dabd24064e6018:9

lrwxrwxrwx    1 root     root          36 Aug  7 14:27 vml.0200000000690b11c00222400024dabd24064e6018504552432048 -> naa.690b11c00222400024dabd24064e6018

lrwxrwxrwx    1 root     root          38 Aug  7 14:27 vml.0200000000690b11c00222400024dabd24064e6018504552432048:1 -> naa.690b11c00222400024dabd24064e6018:1

lrwxrwxrwx    1 root     root          38 Aug  7 14:27 vml.0200000000690b11c00222400024dabd24064e6018504552432048:2 -> naa.690b11c00222400024dabd24064e6018:2

lrwxrwxrwx    1 root     root          38 Aug  7 14:27 vml.0200000000690b11c00222400024dabd24064e6018504552432048:3 -> naa.690b11c00222400024dabd24064e6018:3

lrwxrwxrwx    1 root     root          38 Aug  7 14:27 vml.0200000000690b11c00222400024dabd24064e6018504552432048:5 -> naa.690b11c00222400024dabd24064e6018:5

lrwxrwxrwx    1 root     root          38 Aug  7 14:27 vml.0200000000690b11c00222400024dabd24064e6018504552432048:6 -> naa.690b11c00222400024dabd24064e6018:6

lrwxrwxrwx    1 root     root          38 Aug  7 14:27 vml.0200000000690b11c00222400024dabd24064e6018504552432048:7 -> naa.690b11c00222400024dabd24064e6018:7

lrwxrwxrwx    1 root     root          38 Aug  7 14:27 vml.0200000000690b11c00222400024dabd24064e6018504552432048:8 -> naa.690b11c00222400024dabd24064e6018:8

lrwxrwxrwx    1 root     root          38 Aug  7 14:27 vml.0200000000690b11c00222400024dabd24064e6018504552432048:9 -> naa.690b11c00222400024dabd24064e6018:9

Charles123987
Contributor

Perhaps you are unfamiliar with how a Dell PERC RAID controller behaves when a RAID array is 'enlarged'.

The volume just becomes larger, with all the data/partitions on it remaining in the lower sectors of the virtual drive.  No new drive devices or disk partitions are created.

I don't know/remember how ESXi handled it when I told it to add the new space to the datastore.

continuum
Immortal

If ESXi complains about 2 missing extents, there should be either 2 more disks or 2 more partitions (very unlikely).
Please upload the header dump - then I can give you more details for the missing extents.

Have you been able to extract vmdks larger than 512 GB ?



Charles123987
Contributor

My company has an SFTP server where I can upload the VMFS header.  If you can give me your public SSH key, I can open access for you.

My boss doesn't want me to post the file to a public upload hosting site.

Charles123987
Contributor

I have uploaded it to the following address:

sftp://4vmware@sftp.crossix.com/from-crossix/

Just send me your public SSH key and I'll grant you access.

continuum
Immortal

Run this command against your dumpfile:

dd if=dumpfile bs=1M count=2 | strings -t d > r.txt

This will look similar to this:

1048622 vmhba1:0:0

1048706 7       J\

1052692 5c4a0937-06866605-29c5-000c29bade33

1052756 7       J\

1568768 mpx.vmhba1:C0:T0:L0:3

1569024 mpx.vmhba1:C0:T4:L0:1

The first device name in that listing (mpx.vmhba1:C0:T0:L0:3) specifies the "parent VMFS volume".

The second one (mpx.vmhba1:C0:T4:L0:1) specifies the first extent - and in your case I expect to see 2 extents.



Charles123987
Contributor

I had to go higher than count=2 to show the values:

1048638 PERC H

1049108 53fc9636-a8c59864-7a6a-c81f66b9a264

1565184 naa.690b11c0022240001b8f4049062a3bdc:3

19922973 datastore1

21250102 192.168.15.105

21250614 192.168.15.105

22153777 XyV8

23068964 .fbb.sf

At that point it is showing directory entries from the root directory of the datastore.

Looks like only one extent to me.

continuum
Immortal

Ok - that result contradicts

esxcfg-volume -l

VMFS UUID/label: 53fc9638-dab159ce-4eea-c81f66b9a264/datastore1

Can mount: No (some extents missing)

Can resignature: No (some extents missing)

But according to that, a vmfs-fuse newer than version 5.1 should be able to extract your vmdks.
vmfs-tools 5.1 will only extract vmdks up to 256 GB.
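
Once a newer build is in place, extraction is just an ordinary file copy from the fuse mountpoint - for example (the VM directory and destination here are only placeholders):

vmfs-fuse -r /dev/sdb3 /vmfs             # read-only mount with the newer vmfs-tools build
rsync -aP /vmfs/bigvm/ /backup/bigvm/    # copy a VM directory off the mounted datastore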



Charles123987
Contributor

Any 'fairly' simple way to force it to mount inside ESXi ?  It would make things a LOT easier if I could bring up the VM for a few hours instead of copying 17TB of .vmdk files.

If not, where can I download the most recent vmfs-tools - https://glandium.org/projects/vmfs-tools/ ?  Can I use the most recent vmfs-tools with CentOS, or should I use a different live USB?

continuum
Immortal

> Any 'fairly' simple way to force it to mount inside ESXi ?  It would make things a LOT easier if I could bring up the VM for a few hours instead of copying 17TB of .vmdk files.
Apparently your ESXi finds something that prevents it from mounting. If resignature does not work then there is no way that I am aware of.
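
For completeness, these are the commands you would normally try - but given the "some extents missing" state I expect both to fail:

esxcfg-volume -m 53fc9638-dab159ce-4eea-c81f66b9a264    # attempt a non-persistent force mount
esxcfg-volume -r 53fc9638-dab159ce-4eea-c81f66b9a264    # attempt to resignature the volume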

> where can I download the most recent vmfs-tools?

Sorry - I don't know.

I found an experimental version some time ago but don't remember where.

You can use my livecd http://sanbarrow.com/livecds/moa64-nogui/MOA64-nogui-incl-src-111014-efi.iso

It has 5.1 installed, but you can add a newer version - which can handle vmdks larger than 256 GB.

Call me via skype and I will talk you through the process.



Charles123987
Contributor

For reference:

The vmfs-tools package I had installed did not support files larger than 256GB, so I eventually had to download

https://github.com/mlsorensen/vmfs-tools/archive/master.zip and compile that.

Note: this compile needed a "yum install fuse-devel uuid-devel uuid-dce-devel libuuid-devel".
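
The build itself was the usual configure/make run - roughly (from memory; the unpacked directory name may differ):

unzip master.zip && cd vmfs-tools-master
./configure          # needs the fuse/uuid devel packages listed above
make
make install         # or run the freshly built vmfs-fuse straight from the source tree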

Afterwards, I was able to mount the partition with:

vmfs-fuse -r -o allow_other /dev/sdb3 /vmfs

and then the data was visible with:

guestmount -r -o allow_other -a /vmfs/vnas/vnas-flat.vmdk -a /vmfs/vnas/vnas_1-flat.vmdk -a /vmfs/vnas/vnas_2-flat.vmdk -a /vmfs/vnas/vnas_3-flat.vmdk -a /vmfs/vnas/vnas_4-flat.vmdk -a /vmfs/vnas/vnas_5-flat.vmdk -a /vmfs/vnas/vnas_6-flat.vmdk -m /dev/vg_vnas/lv_data /mnt
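
From there the guest filesystem was readable under /mnt, and the data can be pulled off with ordinary tools, e.g. (the target path is just an example):

rsync -aP /mnt/ /backup/vnas-data/    # copy the recovered data off the read-only guestmount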
