VMware Cloud Community
bsoima
Contributor

Cannot mount or resignature VMFS extent after the disk ID changed.

I had a 950 GB VMFS file system made up of two VMFS partitions on two LUNs (I added an extent some time ago).

The first partition is 544 GB and the extent is 408 GB.

The problem seems to have started after an update to vSphere 4 Update 1, but I can't be sure why the disk IDs changed.

Only the extent's h(id) changed, and now I can only access the data/virtual machines stored on the first partition.

I also get the following errors:

tail -f /var/log/vmkernel

Nov 29 08:50:25 vmx1 vmkernel: 0:07:06:48.668 cpu4:4111)ALERT: LVM: 2054: One or more devices not found (file system StoreVMFS, 47b454b6-89ec4dc8-5a3e-001a6426cdb2)

Nov 29 03:20:35 vmx1 vmkernel: 0:01:36:58.386 cpu7:4109)LVM: 7165: Device naa.600a0b80003a8bee000006bd4aea9229:1 detected to be a snapshot:

After reading up on the "detected to be a snapshot" situation, I tried running esxcfg-volume -l, with the following output:

VMFS3 UUID/label: n.a./n.a.

Can mount: No (some extents missing)

Can resignature: No (some extents missing)

Extent name: naa.600a0b80003a8bee000006bd4aea9229:1 range: 557824 - 976127 (MB)

The output from vmkfstools -P /vmfs/volumes/StoreVMFS :

VMFS-3.31 file system spanning 1 partitions.

File system label (if any): StoreVMFS

Mode: public

Capacity 1023544393728 (976128 file blocks * 1048576), 61338550272 (58497 blocks) avail

UUID: 47b454b6-89ec4dc8-5a3e-001a6426cdb2

Partitions spanned (on "lvm"):

naa.600a0b800038b1a10000036b47b28b21:1

(One or more partitions spanned by this volume may be offline)

So, as you can see, for some reason the extent is not being seen correctly.

How can I tell ESX to use this LUN as the extent?
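The numbers in the two outputs above can be cross-checked with a little arithmetic. This is only a minimal Python sketch using values quoted in this post; the variable names are my own and not anything ESX reports.

```python
# Values copied from the vmkfstools -P and esxcfg-volume -l output above.
BLOCK_SIZE = 1048576        # bytes per file block (1 MB)
TOTAL_BLOCKS = 976128       # file blocks in the whole volume
EXTENT_START_MB = 557824    # first MB of the unresolved extent
EXTENT_END_MB = 976127      # last MB of the unresolved extent (inclusive)

# Total capacity in bytes, as printed on the "Capacity" line.
capacity_bytes = TOTAL_BLOCKS * BLOCK_SIZE

# Everything below the extent's start offset lives on the first LUN.
head_mb = EXTENT_START_MB
# The range is inclusive, hence the +1.
extent_mb = EXTENT_END_MB - EXTENT_START_MB + 1

print("capacity in bytes:", capacity_bytes)
print("first partition in GB:", head_mb / 1024)
print("missing extent in GB:", extent_mb / 1024)
print("ranges add up to the full volume:", head_mb + extent_mb == TOTAL_BLOCKS)
```

With a 1 MB block size, one file block equals one MB, so the extent range reported by esxcfg-volume -l lines up with the tail of the volume that vmkfstools -P flags as possibly offline.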

Both LUNs appear when I run fdisk -l:

Disk /dev/sdb: 438.8 GB, 438831153152 bytes

255 heads, 63 sectors/track, 53351 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System

/dev/sdb1 1 53351 428541843 fb VMware VMFS

Disk /dev/sdf: 585.1 GB, 585111699456 bytes

255 heads, 63 sectors/track, 71135 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System

/dev/sdf1 1 71135 571391823 fb VMware VMFS

Can I do anything to recover the data from the extent?

The LUN appears correctly under storage devices.

Thanks!

bsoima
Contributor

esxcfg-info data :

\==+Vm FileSystem :

|----Volume UUID........................................47b454b6-89ec4dc8-5a3e-001a6426cdb2

|----Head Extent........................................naa.600a0b800038b1a10000036b47b28b21:1

|----Console Path......................................./vmfs/volumes/47b454b6-89ec4dc8-5a3e-001a6426cdb2

|----Block Size.........................................1048576

|----Total Blocks.......................................976128

|----Blocks Used........................................917801

|----Size...............................................1023544393728

|----Usage..............................................962384101376

|----Volume Name........................................StoreVMFS

|----Lock Mode..........................................public

|----Major Version......................................3

|----Minor Version......................................31

|----Is Force Mounted...................................false

\==+Extents :

\==+Disk Lun Partition :

|----Name.........................................naa.600a0b800038b1a10000036b47b28b21:1

|----Partition Number.............................1

|----Start Sector.................................128

|----End Sector...................................1142783775

|----Partition Type...............................251

|----Console Device.............................../dev/sdf1

|----DevFS Path.................................../vmfs/devices/disks/naa.600a0b800038b1a10000036b47b28b21:1

|----Size.........................................585105227264

|----Type.........................................0x000000fb

\==+Unresolved VMFS Volumes :

\==+Unresolved VMFS Volume :

|----LVM UUID...........................................47b454b3-97189d78-7481-001a6426cdb2

|----VMFS UUID..........................................

|----VMFS Label.........................................

\==+Unresolved Extents :

\==+Unresolved VMFS Extent :

|----LVM Name.....................................47b454b3-97189d78-7481-001a6426cdb2

|----VMFS UUID....................................

|----Start........................................557824

|----End..........................................976127

|----Index........................................2

\==+Disk Lun Partition :

|----Name......................................naa.600a0b80003a8bee000006bd4aea9229:1

|----Partition Number..........................1

|----Start Sector..............................128

|----End Sector................................857083815

|----Partition Type............................251

|----Console Device............................/dev/sdb1

|----DevFS Path................................/vmfs/devices/disks/naa.600a0b80003a8bee000006bd4aea9229:1

|----Size......................................438826847744

|----Type......................................0x000000fb
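As a sanity check on the dump above, the reported sizes can be recomputed from the sector ranges. This is a rough Python sketch under my own assumptions: a 512-byte sector (taken from the fdisk "* 512" unit line) and a helper name, size_from_sectors, that I made up for illustration.

```python
SECTOR_BYTES = 512  # assumed sector size, matching the fdisk unit line

# (name, start sector, end sector, reported Size) copied from esxcfg-info.
partitions = [
    ("naa.600a0b800038b1a10000036b47b28b21:1", 128, 1142783775, 585105227264),
    ("naa.600a0b80003a8bee000006bd4aea9229:1", 128, 857083815, 438826847744),
]

def size_from_sectors(start, end):
    """Size in bytes as (end - start) * sector size.

    This formula is inferred from the numbers in this thread,
    not taken from VMware documentation.
    """
    return (end - start) * SECTOR_BYTES

for name, start, end, reported in partitions:
    print(name, size_from_sectors(start, end) == reported)

# "Usage" should equal "Blocks Used" times "Block Size".
usage_matches = 917801 * 1048576 == 962384101376
print("usage matches blocks used:", usage_matches)
```

Both partitions come out consistent, which suggests the partition tables on the two LUNs are intact and the problem is limited to how the second LUN is identified.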

I rebooted the host and I see the following in the log:

LVM: 7172: queried disk ID: <type 2, len 22, lun 5, devType 0, scsi 0, h(id) 913820410008038874>

LVM: 7179: on-disk disk ID: <type 2, len 22, lun 5, devType 0, scsi 0, h(id) 18102633032534102112>
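To see exactly which field differs between the two IDs, the log lines can be compared field by field. The sketch below is plain Python text parsing; the function names are my own, and it says nothing about how the LVM driver itself does the comparison.

```python
import re

# The two log lines quoted above, reproduced verbatim.
queried = ("LVM: 7172: queried disk ID: <type 2, len 22, lun 5, "
           "devType 0, scsi 0, h(id) 913820410008038874>")
on_disk = ("LVM: 7179: on-disk disk ID: <type 2, len 22, lun 5, "
           "devType 0, scsi 0, h(id) 18102633032534102112>")

# Each field inside <...> is a name followed by a decimal number.
FIELD = re.compile(r"(type|len|lun|devType|scsi|h\(id\)) (\d+)")

def disk_id_fields(line):
    """Return the name/value pairs found inside the <...> part of the line."""
    body = line[line.index("<") + 1:line.rindex(">")]
    return {name: int(value) for name, value in FIELD.findall(body)}

def differing_fields(a, b):
    """Return the sorted field names whose values differ between a and b."""
    return sorted(name for name in a if a[name] != b.get(name))

print(differing_fields(disk_id_fields(queried), disk_id_fields(on_disk)))
```

For these two lines only h(id) differs; type, len, lun, devType and scsi are identical, which fits the observation that the LUN number stayed the same.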

I also tried to enable resignaturing with this command: esxcfg-advcfg -s 0 /LVM/EnableResignature

In vmkernel.log:

LVM: 10154: Extent zero missing

LVM: 10571: Failed to validate arguments

LVM: 9670: Failed resignaturing operation with status: Bad parameter

binoche
VMware Employee

vmkfstools -P /vmfs/volumes/StoreVMFS shows that naa.600a0b800038b1a10000036b47b28b21:1 (/dev/sdf1) is the 1st extent.

If StoreVMFS has only 2 extents, my guess is that naa.600a0b80003a8bee000006bd4aea9229:1 (/dev/sdb1) is the 2nd extent, which is now detected as a snapshot, but esxcfg-volume --resignature cannot complete the resignature here.

Have you changed anything on naa.600a0b80003a8bee000006bd4aea9229 since the update to vSphere 4 Update 1, such as the LUN number?

If you have not changed anything, I would suggest downgrading to vSphere 4 and backing up StoreVMFS first.

binoche, VMware VCP, Cisco CCNA

bsoima
Contributor

I tried the downgrade yesterday but it's the same situation. The 2nd extent still appears with a changed ID.

I don't know why, because I didn't modify anything on the SAN storage.

I can't believe that VMware does not allow me to mount this extent just because an ID is different.

binoche
VMware Employee

esxcfg-volume -l did not report a valid VMFS3 UUID/label for naa.600a0b80003a8bee000006bd4aea9229:1 (it shows "VMFS3 UUID/label: n.a./n.a.").

My guess is that the LUN ID naa.600a0b80003a8bee000006bd4aea9229 did not change after the upgrade and only the LUN number changed, right? Could you please revert the LUN number of naa.600a0b80003a8bee000006bd4aea9229? Thanks.

binoche, VMware VCP, Cisco CCNA

bsoima
Contributor

I looked more carefully at vmkernel.log on the machine that was not downgraded to vSphere 4.0.

I think that before the mix-up the LUN had a different ID: naa.600a0b80003a8bee000004294861a42e

Now it has naa.600a0b80003a8bee000006bd4aea9229.

The LUN number is the same; only the naa ID seems to have changed. Can I revert it somehow? Rename it to the old value?

Thanks

binoche
VMware Employee

Please revert to the previous ID, and then run the commands below:

esxcfg-rescan -d vmhb # to add naa.600a0b80003a8bee000004294861a42e

vmkfstools -V # to refresh all VMFS volumes

esxcfg-volume -l # to check whether naa.600a0b80003a8bee000004294861a42e is detected as a snapshot

vmkfstools -P -h /vmfs/volumes/StoreVMFS # to check whether StoreVMFS now has the correct extents
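The last check can be done by eye, but if you capture the vmkfstools -P output to a file, a small script can compare the "spanning N partitions" header with the devices actually listed. This is only a sketch in Python: check_extents is a made-up helper, the sample text is the output posted earlier in this thread (Capacity line omitted), and matching on the "naa." prefix is an assumption that holds for the device names seen here.

```python
import re

def check_extents(vmkfstools_output):
    """Return (expected partition count, listed devices, offline warning seen)."""
    header = re.search(r"spanning (\d+) partitions", vmkfstools_output)
    expected = int(header.group(1)) if header else None
    # Assumption: extent lines look like "naa.<hex>:<partition number>".
    devices = re.findall(r"naa\.[0-9a-f]+:\d+", vmkfstools_output)
    offline = "may be offline" in vmkfstools_output
    return expected, devices, offline

sample = """VMFS-3.31 file system spanning 1 partitions.
File system label (if any): StoreVMFS
Mode: public
UUID: 47b454b6-89ec4dc8-5a3e-001a6426cdb2
Partitions spanned (on "lvm"):
naa.600a0b800038b1a10000036b47b28b21:1
(One or more partitions spanned by this volume may be offline)
"""

expected, devices, offline = check_extents(sample)
print("expected partitions:", expected)
print("listed devices:", devices)
print("offline warning present:", offline)
```

For a healthy two-extent volume you would hope to see two devices listed and no offline warning.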

binoche, VMware VCP, Cisco CCNA

bsoima
Contributor

Thanks for the info, but I don't know how to revert to the previous ID. That's the problem, I guess...

I didn't change the LUN number or other settings on the storage. What else could trigger the naa ID change?

binoche
VMware Employee

What storage is in use?

Is naa.600a0b80003a8bee000006bd4aea9229 really a snapshot from your storage, and was naa.600a0b80003a8bee000004294861a42e perhaps no longer presented to your host?

Can you still find naa.600a0b80003a8bee000004294861a42e on your storage?

binoche, VMware VCP, Cisco CCNA

bsoima
Contributor

It's an IBM DS3400 Fibre Channel storage array. There's no snapshot on the storage; there are only the 2 LUNs for the VMFS volume.

It seems the naa ID of the 2nd VMFS extent changed for some unknown reason.

Both LUNs are presented to the VMware hosts.

binoche
VMware Employee

Could you please upload all /var/log/vmkernel* files? I would also like to recheck. Thanks.

binoche, VMware VCP, Cisco CCNA

bsoima
Contributor

I did it! I managed to mount them.

I manually made a snapshot of the 1st VMFS partition and assigned this snapshot and the 2nd partition to the VMware hosts.

They were both detected as snapshots, but it mounted them:

VMFS-3.31 file system spanning 2 partitions.

File system label (if any): VMFSStore

Mode: public

Capacity 1023544393728 (976128 file blocks * 1048576), 223688523776 (213326 blocks) avail

UUID: 47b454b6-89ec4dc8-5a3e-001a6426cdb2

Partitions spanned (on "lvm"):

naa.600a0b800038b1a10000053d4b13194e:1

naa.600a0b80003a8bee000006bd4aea9229:1

I can access the data now.

I think I will try to do a Storage vMotion to a temporary VMFS volume, re-create the original VMFS volume, and then Storage vMotion back.

binoche
VMware Employee

Great!

Maybe you could consider NOT using a 2nd VMFS extent.

For example, use just one 950 GB LUN, and if you want to grow its capacity, grow the LUN on the DS3400 and then let VMFS grow its extent as well.

binoche, VMware VCP, Cisco CCNA

bsoima
Contributor

I will not use extents next time. I only used them because the IBM DS3400 storage cannot expand a LUN (increase its size) once it has been created.

BA7NQ
Contributor

Hi bsoima,

You said you "manually made a snapshot of the 1st VMFS partition and assigned this snapshot and the 2nd partition to the vmware hosts". How do you manually make a snapshot of a VMFS partition?

Thanks...

Terry
