VMware Cloud Community
ziffusion
Contributor

Datastore disappeared after disk replacement in a RAID 1 array (ESXi 6.7)

I had a datastore that spanned two partitions on two disks. After some disk maintenance, the datastore disappeared.

The 2 vmfs partitions on 2 disks seem to still be there:

naa.6842b2b07523ad00266e747a07b4f40c:3

naa.6842b2b07523ad0022b0972b1b2d9380:1

Here are a couple of things I see in the logs:

2020-06-07T01:18:16.558Z cpu3:2097803)ScsiDeviceIO: 3015: Cmd(0x459a6a755f80) 0x1a, CmdSN 0x1944 from world 0 to dev "naa.6842b2b07523ad00266e747a07b4f40c" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

2020-06-07T01:18:16.558Z cpu3:2102696)LVM: 11279: Device naa.6842b2b07523ad00266e747a07b4f40c:3 detected to be a snapshot:

2020-06-07T01:18:16.558Z cpu3:2102696)LVM: 11286:   queried disk ID: <type 2, len 22, lun 0, devType 0, scsi 0, h(id) 9838619252749419043>

2020-06-07T01:18:16.558Z cpu3:2102696)LVM: 11293:   on-disk disk ID: <type 2, len 22, lun 0, devType 0, scsi 0, h(id) 12515120875663827942>

2020-06-07T01:18:17.051Z cpu11:2097804)ScsiDeviceIO: 3015: Cmd(0x45a2403d9480) 0x1a, CmdSN 0x19f4 from world 0 to dev "naa.6842b2b07523ad00266e747a07b4f40c" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

2020-06-07T01:18:17.096Z cpu11:2097804)ScsiDeviceIO: 3015: Cmd(0x45a240214b40) 0x1a, CmdSN 0x1a09 from world 0 to dev "naa.6842b2b07523ad0022b0972b1b2d9380" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

2020-06-07T01:18:17.103Z cpu12:2098768)LVM: 14889: Extent zero missing
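
For readers unfamiliar with the status fields in the ScsiDeviceIO lines above, here is a minimal Python sketch that decodes them. The lookup tables are deliberately limited to the values that appear in this log, with meanings taken from the SCSI standards; the function name `decode` and the dictionary keys are illustrative choices, not part of any VMware tool.

```python
import re

# Lookup tables limited to the values seen in the log lines above.
# The meanings come from the SCSI standards (SAM/SPC); extend as needed.
OPCODES = {0x1A: "MODE SENSE(6)"}
DEVICE_STATUS = {0x0: "GOOD", 0x2: "CHECK CONDITION"}
SENSE_KEYS = {0x5: "ILLEGAL REQUEST"}
ASC_ASCQ = {(0x24, 0x00): "INVALID FIELD IN CDB"}

LINE_RE = re.compile(
    r'Cmd\(0x[0-9a-f]+\) (0x[0-9a-f]+),.*?to dev "([^"]+)" failed '
    r'H:(0x[0-9a-f]+) D:(0x[0-9a-f]+) P:(0x[0-9a-f]+) '
    r'Valid sense data: (0x[0-9a-f]+) (0x[0-9a-f]+) (0x[0-9a-f]+)'
)

def decode(line):
    """Decode one ScsiDeviceIO failure line; return None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    opcode = int(m.group(1), 16)
    host, device, plugin, key, asc, ascq = (int(g, 16) for g in m.groups()[2:])
    return {
        "device": m.group(2),
        "opcode": OPCODES.get(opcode, hex(opcode)),
        "host_status": host,
        "device_status": DEVICE_STATUS.get(device, hex(device)),
        "plugin_status": plugin,
        "sense_key": SENSE_KEYS.get(key, hex(key)),
        "additional_sense": ASC_ASCQ.get((asc, ascq), f"{asc:#x}/{ascq:#x}"),
    }

# First ScsiDeviceIO line from the log above.
SAMPLE = (
    '2020-06-07T01:18:16.558Z cpu3:2097803)ScsiDeviceIO: 3015: '
    'Cmd(0x459a6a755f80) 0x1a, CmdSN 0x1944 from world 0 to dev '
    '"naa.6842b2b07523ad00266e747a07b4f40c" failed H:0x0 D:0x2 P:0x0 '
    'Valid sense data: 0x5 0x24 0x0.'
)
result = decode(SAMPLE)
print(result)
```

Read this way, the lines say that the device rejected a MODE SENSE(6) command as an illegal request. Whether that is related to the missing datastore cannot be concluded from these lines alone.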

Output of command esxcfg-volume -l:

VMFS UUID/label: 5b1f0184-de3a7854-541e-842b2b5ad05e/datastore1

Can mount: No (some extents missing)

Can resignature: No (some extents missing)

Extent name: naa.6842b2b07523ad00266e747a07b4f40c:3        range: 0 - 277759 (MB)
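
To make the mismatch explicit, the following Python sketch parses the esxcfg-volume -l output above and checks which of the two VMFS partitions listed earlier in this post appear as extents. The function name `parse_volume` and the variable names are hypothetical; the input text is copied from this post.

```python
import re

# Output of `esxcfg-volume -l` as posted above.
OUTPUT = """\
VMFS UUID/label: 5b1f0184-de3a7854-541e-842b2b5ad05e/datastore1
Can mount: No (some extents missing)
Can resignature: No (some extents missing)
Extent name: naa.6842b2b07523ad00266e747a07b4f40c:3        range: 0 - 277759 (MB)
"""

# The two VMFS partitions that are still visible according to the post.
KNOWN_PARTITIONS = [
    "naa.6842b2b07523ad00266e747a07b4f40c:3",
    "naa.6842b2b07523ad0022b0972b1b2d9380:1",
]

def parse_volume(text):
    """Collect the UUID, label, and extents from one volume entry."""
    vol = {"extents": []}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("VMFS UUID/label:"):
            uuid, _, label = line.split(":", 1)[1].strip().partition("/")
            vol["uuid"], vol["label"] = uuid, label
        elif line.startswith("Extent name:"):
            m = re.match(r"Extent name:\s*(\S+)\s+range:\s*(\d+)\s*-\s*(\d+)", line)
            if m:
                vol["extents"].append((m.group(1), int(m.group(2)), int(m.group(3))))
    return vol

vol = parse_volume(OUTPUT)
listed = {name for name, _, _ in vol["extents"]}
unlisted = [p for p in KNOWN_PARTITIONS if p not in listed]
print("Listed extents:", sorted(listed))
print("Visible partitions not listed as extents:", unlisted)
```

With the output above, only the partition on naa.6842b2b07523ad00266e747a07b4f40c is reported as an extent; the partition on the second disk is not.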

Any help will be much appreciated.

10 Replies
Nawals
Expert

Hi,

Here is a VMware KB article for a similar issue, if you are using vSAN: VMware Knowledge Base

If you are not using vSAN, follow this VMware KB article: VMware Knowledge Base

NKS. Please mark as helpful/correct if my answer resolves your query.
Ardaneh
Enthusiast

Hi

When the ESXi host cannot match the identity of the LUN against what it expects to see in the VMFS metadata, it treats the LUN as a snapshot LUN. This can happen after replacing SAN hardware or updating firmware.
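
The LVM lines in the original post show this comparison directly: the log prints a "queried" disk ID and an "on-disk" disk ID. Below is a minimal Python sketch, with hypothetical function and variable names, that parses those two lines and reports which fields differ. It only reads the log text; it does not reproduce how ESXi computes the IDs.

```python
import re

# The three LVM lines from the original post.
LOG = """\
2020-06-07T01:18:16.558Z cpu3:2102696)LVM: 11279: Device naa.6842b2b07523ad00266e747a07b4f40c:3 detected to be a snapshot:
2020-06-07T01:18:16.558Z cpu3:2102696)LVM: 11286:   queried disk ID: <type 2, len 22, lun 0, devType 0, scsi 0, h(id) 9838619252749419043>
2020-06-07T01:18:16.558Z cpu3:2102696)LVM: 11293:   on-disk disk ID: <type 2, len 22, lun 0, devType 0, scsi 0, h(id) 12515120875663827942>
"""

ID_RE = re.compile(r"(queried|on-disk) disk ID: <(.*?)>")

def disk_ids(log):
    """Return {'queried': {...}, 'on-disk': {...}} with integer field values."""
    ids = {}
    for kind, body in ID_RE.findall(log):
        fields = {}
        for item in body.split(", "):
            name, _, value = item.rpartition(" ")
            fields[name] = int(value)
        ids[kind] = fields
    return ids

ids = disk_ids(LOG)
differing = sorted(
    name for name in ids["queried"]
    if ids["queried"][name] != ids["on-disk"].get(name)
)
print("Fields that differ:", differing)
```

For the log in the original post, the only differing field is h(id): the ID the host derives from the device as it is presented now no longer matches the one recorded on disk.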

To solve the problem, you can add the datastore by using the vSphere Client. To do this:

In the Hardware panel of the Configuration tab, click Storage and then click Add Storage. Select the Disk/LUN storage type and click Next. From the list of LUNs, select the LUN that has a datastore name displayed in the VMFS Label column, then click Next.

Under Mount Options, these options are displayed:

Keep Existing Signature: Persistently mount the LUN (for example, mount LUN across reboots)

Assign a New Signature: Resignature the LUN

Format the disk: Reformat the LUN

Note that "Format the disk" deletes any existing data on the LUN, so choose the option carefully.

You can check this article: VMware Knowledge Base

Hope this helps.

ziffusion
Contributor

I get the following errors when doing what is recommended in the article.

esxcli storage vmfs snapshot mount -l datastore1

Unable to mount this VMFS volume due to some extents missing

esxcfg-volume -m datastore1

Mounting volume datastore1

Error: Unable to mount this VMFS volume due to some extents missing

subinchungath
Contributor

Dear Ziffusion,

Did you get a chance to go through the VMware Knowledge Base article?

One of the reasons could be vSphere's handling of LUNs detected as snapshot LUNs.

-Regards

ziffusion
Contributor

subinchungath

Yes, I did go through that. It's a very similar problem. The suggested solution there is to mount the volume using esxcli and resignature it.

But in my case there is the additional twist that the volume can be neither mounted nor resignatured because of a missing extent. See my comment above:

esxcli storage vmfs snapshot mount -l datastore1

Unable to mount this VMFS volume due to some extents missing

esxcfg-volume -m datastore1

Mounting volume datastore1

Error: Unable to mount this VMFS volume due to some extents missing

esxcli storage vmfs snapshot list

5b1f0184-de3a7854-541e-842b2b5ad05e

   Volume Name: datastore1

   VMFS UUID: 5b1f0184-de3a7854-541e-842b2b5ad05e

   Can mount: false

   Reason for un-mountability: some extents missing

   Can resignature: false

   Reason for non-resignaturability: some extents missing

   Unresolved Extent Count: 1

Something happened in the process of replacing a disk in the RAID 1 array (see OP).

continuum
Immortal

Hi

Are you sure that the missing extents are still available?

If yes, call me via Skype "sanbarrow".

Ulli


________________________________________________
Do you need support with a VMFS recovery problem ? - send a message via skype "sanbarrow"
I do not support Workstation 16 at this time ...

ziffusion
Contributor

continuum

Yes, I think both the extents are still available. It's in the OP.

I'll call you on Skype. Is some time better than others? Not sure which time zone you are in.

Sanjay

continuum
Immortal

You can call between noon and 2 a.m. German time.

I will need a 2 GB dump of the parent datastore and a 30 MB dump of each additional extent.

See "Create a VMFS-Header-dump using an ESXi-Host in production" on VM-Sickbay.



YvesWaldmann
Contributor

Hi,

Sorry to bring this up again, but I have a similar problem.

After some disk failures on my standalone ESXi 6.5 host, followed by some unplugging, replugging, and reboots, I have lost my datastore3, which was based on a 1 TB disk and extended with another 1 TB disk (the disks are local RAID 1 SSD arrays; Dell PER620 with H710 Mini).

I'm currently restoring the failed VMs to another host, so there is no "real" problem (apart from downtime), but from what I have read, it seems that in my case the parent and the extent have been "swapped":

When I hexdump, I find a reference to the parent in (what is supposed to be) the extent, and nothing in the "parent":

[root@xxxx:~] esxcfg-volume --list
VMFS UUID/label: 59e9e7ea-7b1af861-2426-b8ac6f8edb3c/datastore3
Can mount: No (some extents missing)
Can resignature: No (some extents missing)
Extent name: naa.6d4ae5208303e000ff00005105163d92:1 range: 0 - 976127 (MB)

And the "extent" dump :

hexdump -C /dev/disks/naa.6d4ae5208303e000ff00005105163d92:1 | less

00100200 00 00 00 a0 dc 01 00 00 cc 1d 00 00 00 00 00 00 |................|
00100210 01 00 00 00 35 39 65 39 65 37 65 61 2d 34 31 66 |....59e9e7ea-41f|
00100220 38 37 61 62 35 2d 63 32 36 62 2d 62 38 61 63 36 |87ab5-c26b-b8ac6|
00100230 66 38 65 64 62 33 63 00 00 00 00 00 00 00 00 00 |f8edb3c.........|
00100240 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00100250 00 00 00 00 ea e7 e9 59 b5 7a f8 41 6b c2 b8 ac |.......Y.z.Ak...|
00100260 6f 8e db 3c 01 00 00 00 4b 6c fc 5e f9 5b 05 00 |o..<....Kl.^.[..|
00100270 00 00 00 00 e5 0e 00 00 00 00 00 00 00 00 00 00 |................|
00100280 e4 0e 00 00 00 00 00 00 08 d1 c3 a5 bb cc 05 00 |................|
00100290 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
001002a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
0017e200 6e 61 61 2e 36 64 34 61 65 35 32 30 38 33 30 33 |naa.6d4ae5208303|
0017e210 65 30 30 30 32 31 37 63 61 34 31 62 35 30 61 63 |e000217ca41b50ac|
0017e220 35 66 32 63 3a 31 00 00 00 00 00 00 00 00 00 00 |5f2c:1..........|
0017e230 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
0017e300 6e 61 61 2e 36 64 34 61 65 35 32 30 38 33 30 33 |naa.6d4ae5208303|
0017e310 65 30 30 30 32 38 65 31 62 63 33 31 63 39 64 33 |e00028e1bc31c9d3|
0017e320 62 35 64 61 3a 31 00 00 00 00 00 00 00 00 00 00 |b5da:1..........|
0017e330 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
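
As a cross-check on the dump above, the 16 bytes starting at offset 0x100254 can be formatted as a UUID and compared with the ASCII string at 0x100214. The Python sketch below assumes a layout of two little-endian 32-bit values, one little-endian 16-bit value, and six bytes printed as-is. That layout is inferred from this dump only, and the function name `format_uuid` is hypothetical.

```python
def format_uuid(raw: bytes) -> str:
    """Format 16 raw bytes as xxxxxxxx-xxxxxxxx-xxxx-xxxxxxxxxxxx.

    Assumed layout (inferred from the dump above): two little-endian
    32-bit values, one little-endian 16-bit value, six bytes as-is.
    """
    if len(raw) != 16:
        raise ValueError("expected exactly 16 bytes")
    a = int.from_bytes(raw[0:4], "little")
    b = int.from_bytes(raw[4:8], "little")
    c = int.from_bytes(raw[8:10], "little")
    return f"{a:08x}-{b:08x}-{c:04x}-{raw[10:16].hex()}"

# Bytes at 0x100254..0x100263 in the dump above.
raw = bytes.fromhex("ea e7 e9 59 b5 7a f8 41 6b c2 b8 ac 6f 8e db 3c")
binary_uuid = format_uuid(raw)

# ASCII string stored at 0x100214 in the same dump.
ascii_uuid = "59e9e7ea-41f87ab5-c26b-b8ac6f8edb3c"

# VMFS UUID reported by esxcfg-volume --list for datastore3.
vmfs_uuid = "59e9e7ea-7b1af861-2426-b8ac6f8edb3c"

print(binary_uuid, binary_uuid == ascii_uuid, binary_uuid == vmfs_uuid)
```

The binary and ASCII forms agree with each other, but both differ from the VMFS UUID that esxcfg-volume reports, so the UUID in this header is a separate identifier from the VMFS UUID. What exactly it identifies is not stated in the dump.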

So I'm curious, @continuum, whether something can be done...

Regards

YvesWaldmann
Contributor

In addition:

I have a non-production VM I want to get back, so I have to investigate further.

After much reading and checking, there is no "swap" between parent and extent.

But looking at the headers, I came to the idea that naa.6d4ae5208303e000ff00005105163d92 should in fact be naa.6d4ae5208303e000217ca41b50ac5f2c (the ID listed before the extent ID; on a non-extended volume this "field" is equal to nxx.xxxxxx).
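
The device names can be pulled out of the hexdump text mechanically and compared with the name ESXi reports now. The Python sketch below, with hypothetical helper names, parses the `hexdump -C` lines at 0x17e200 and 0x17e300 from the earlier post and reads the NUL-terminated strings stored there.

```python
# Lines copied from the hexdump in the earlier post.
DUMP = """\
0017e200 6e 61 61 2e 36 64 34 61 65 35 32 30 38 33 30 33 |naa.6d4ae5208303|
0017e210 65 30 30 30 32 31 37 63 61 34 31 62 35 30 61 63 |e000217ca41b50ac|
0017e220 35 66 32 63 3a 31 00 00 00 00 00 00 00 00 00 00 |5f2c:1..........|
*
0017e300 6e 61 61 2e 36 64 34 61 65 35 32 30 38 33 30 33 |naa.6d4ae5208303|
0017e310 65 30 30 30 32 38 65 31 62 63 33 31 63 39 64 33 |e00028e1bc31c9d3|
0017e320 62 35 64 61 3a 31 00 00 00 00 00 00 00 00 00 00 |b5da:1..........|
"""

def parse_hexdump(text):
    """Map each offset to its byte value; '*' repeat markers are skipped."""
    mem = {}
    for line in text.splitlines():
        parts = line.split("|")[0].split()
        if len(parts) < 2:
            continue
        offset = int(parts[0], 16)
        for i, token in enumerate(parts[1:]):
            mem[offset + i] = int(token, 16)
    return mem

def c_string(mem, start):
    """Read bytes from `start` until a NUL byte or the end of known data."""
    out = bytearray()
    addr = start
    while mem.get(addr, 0) != 0:
        out.append(mem[addr])
        addr += 1
    return out.decode("ascii")

mem = parse_hexdump(DUMP)
recorded = [c_string(mem, 0x17E200), c_string(mem, 0x17E300)]

# Device name that esxcfg-volume --list reports for the extent today.
current = "naa.6d4ae5208303e000ff00005105163d92:1"

print(recorded)
print("current name recorded in header:", current in recorded)
```

The header records naa.6d4ae5208303e000217ca41b50ac5f2c:1 and naa.6d4ae5208303e00028e1bc31c9d3b5da:1, and the currently reported name is not among them. That is consistent with the idea that the device ID changed, though it does not by itself show which recorded name the current device corresponds to.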
