I had a 950 GB VMFS file system made up of 2 VMFS partitions on two LUNs (I added an extent some time ago).
So the first partition is 544 GB, and the extent is 408 GB.
The problem seems to have started after an update to vSphere 4 Update 1, but I can't be sure why the disk IDs changed.
Only the extent's h(id) changed, and now I can only access the data/virtual machines stored in the first partition.
I also get the following errors:
tail -f /var/log/vmkernel
Nov 29 08:50:25 vmx1 vmkernel: 0:07:06:48.668 cpu4:4111)ALERT: LVM: 2054: One or more devices not found (file system StoreVMFS, 47b454b6-89ec4dc8-5a3e-001a6426cdb2)
Nov 29 03:20:35 vmx1 vmkernel: 0:01:36:58.386 cpu7:4109)LVM: 7165: Device naa.600a0b80003a8bee000006bd4aea9229:1 detected to be a snapshot:
After reading up on the "detected to be a snapshot" situation, I tried running esxcfg-volume -l, with the following output:
VMFS3 UUID/label: n.a./n.a.
Can mount: No (some extents missing)
Can resignature: No (some extents missing)
Extent name: naa.600a0b80003a8bee000006bd4aea9229:1 range: 557824 - 976127 (MB)
The output from vmkfstools -P /vmfs/volumes/StoreVMFS :
VMFS-3.31 file system spanning 1 partitions.
File system label (if any): StoreVMFS
Mode: public
Capacity 1023544393728 (976128 file blocks * 1048576), 61338550272 (58497 blocks) avail
UUID: 47b454b6-89ec4dc8-5a3e-001a6426cdb2
Partitions spanned (on "lvm"):
naa.600a0b800038b1a10000036b47b28b21:1
(One or more partitions spanned by this volume may be offline)
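As a quick sanity check on the numbers above, here is a small Python sketch. It only reuses figures copied from the vmkfstools -P and esxcfg-volume -l output; it assumes the extent range reported in MB is inclusive and that the first extent covers everything below it.

```python
# Sanity check of the sizes reported above. All figures are copied from the
# pasted output; nothing is queried from a host.
BLOCK_SIZE = 1048576        # bytes per file block (1 MB)
TOTAL_BLOCKS = 976128       # file blocks in the whole volume
MISSING_START_MB = 557824   # first MB of the missing extent
MISSING_END_MB = 976127     # last MB of the missing extent (assumed inclusive)

capacity_bytes = TOTAL_BLOCKS * BLOCK_SIZE

# Assumption: the first extent covers MB 0 .. MISSING_START_MB - 1.
first_extent_mb = MISSING_START_MB
missing_extent_mb = MISSING_END_MB - MISSING_START_MB + 1

print(capacity_bytes)              # matches the "Capacity" line
print(first_extent_mb / 1024)      # size of the first extent in GB (1024 MB each)
print(missing_extent_mb / 1024)    # size of the missing extent in GB
print(first_extent_mb + missing_extent_mb == TOTAL_BLOCKS)
```

Under those assumptions the two ranges add up to the full 976128 blocks, and the sizes come out at roughly 544 GB and 408 GB, which is consistent with the missing range being the whole second extent.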
So as you can see, for some reason the extent is not being seen correctly.
How can I tell ESX to use this LUN as the extent?
Both LUNs appear when I run fdisk -l:
Disk /dev/sdb: 438.8 GB, 438831153152 bytes
255 heads, 63 sectors/track, 53351 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 1 53351 428541843 fb VMware VMFS
Disk /dev/sdf: 585.1 GB, 585111699456 bytes
255 heads, 63 sectors/track, 71135 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdf1 1 71135 571391823 fb VMware VMFS
Can I do anything to recover the data from the extent ?
The LUN appears correctly in storage devices.
THANKS !
esxcfg-info data :
\==+Vm FileSystem :
|----Volume UUID........................................47b454b6-89ec4dc8-5a3e-001a6426cdb2
|----Head Extent........................................naa.600a0b800038b1a10000036b47b28b21:1
|----Console Path......................................./vmfs/volumes/47b454b6-89ec4dc8-5a3e-001a6426cdb2
|----Block Size.........................................1048576
|----Total Blocks.......................................976128
|----Blocks Used........................................917801
|----Size...............................................1023544393728
|----Usage..............................................962384101376
|----Volume Name........................................StoreVMFS
|----Lock Mode..........................................public
|----Major Version......................................3
|----Minor Version......................................31
|----Is Force Mounted...................................false
\==+Extents :
\==+Disk Lun Partition :
|----Name.........................................naa.600a0b800038b1a10000036b47b28b21:1
|----Partition Number.............................1
|----Start Sector.................................128
|----End Sector...................................1142783775
|----Partition Type...............................251
|----Console Device.............................../dev/sdf1
|----DevFS Path.................................../vmfs/devices/disks/naa.600a0b800038b1a10000036b47b28b21:1
|----Size.........................................585105227264
|----Type.........................................0x000000fb
\==+Unresolved VMFS Volumes :
\==+Unresolved VMFS Volume :
|----LVM UUID...........................................47b454b3-97189d78-7481-001a6426cdb2
|----VMFS UUID..........................................
|----VMFS Label.........................................
\==+Unresolved Extents :
\==+Unresolved VMFS Extent :
|----LVM Name.....................................47b454b3-97189d78-7481-001a6426cdb2
|----VMFS UUID....................................
|----Start........................................557824
|----End..........................................976127
|----Index........................................2
\==+Disk Lun Partition :
|----Name......................................naa.600a0b80003a8bee000006bd4aea9229:1
|----Partition Number..........................1
|----Start Sector..............................128
|----End Sector................................857083815
|----Partition Type............................251
|----Console Device............................/dev/sdb1
|----DevFS Path................................/vmfs/devices/disks/naa.600a0b80003a8bee000006bd4aea9229:1
|----Size......................................438826847744
|----Type......................................0x000000fb
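To double-check that the two partitions in the esxcfg-info data are intact and of the expected size, the reported Size can be compared against the sector range. This is only a sketch in Python: the 512-byte sector size is an assumption taken from the fdisk output, and the formula (end minus start, times 512) is simply what matches these numbers, not documented behaviour.

```python
# Compare the "Size" reported by esxcfg-info with the sector range.
# Values are copied from the listing above.
SECTOR_BYTES = 512  # assumed sector size, as in the fdisk output

# device: (start sector, end sector, reported size in bytes)
partitions = {
    "/dev/sdf1": (128, 1142783775, 585105227264),  # head extent
    "/dev/sdb1": (128, 857083815, 438826847744),   # unresolved extent
}

results = {}
for device, (start, end, reported) in partitions.items():
    computed = (end - start) * SECTOR_BYTES
    results[device] = (computed == reported)
    print(device, computed, reported, results[device])
```

Both partitions come out consistent, so the partition table on the second LUN looks the same as expected; only the way the device is identified seems to differ.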
I did a reboot on the host and I see the following in the log :
LVM: 7172: queried disk ID: <type 2, len 22, lun 5, devType 0, scsi 0, h(id) 913820410008038874>
LVM: 7179: on-disk disk ID: <type 2, len 22, lun 5, devType 0, scsi 0, h(id) 18102633032534102112>
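To see exactly which part of the disk ID differs, the two log lines can be compared field by field. A minimal Python sketch, assuming the fields between "<" and ">" are always "name value" pairs separated by commas; parse_disk_id is a helper made up for this illustration, not a VMware tool.

```python
import re

# The two lines are copied verbatim from the vmkernel log above.
queried = ("LVM: 7172: queried disk ID: <type 2, len 22, lun 5, "
           "devType 0, scsi 0, h(id) 913820410008038874>")
on_disk = ("LVM: 7179: on-disk disk ID: <type 2, len 22, lun 5, "
           "devType 0, scsi 0, h(id) 18102633032534102112>")

def parse_disk_id(line):
    """Return the fields between '<' and '>' as a dict of name -> int."""
    body = re.search(r"<(.*)>", line).group(1)
    fields = {}
    for part in body.split(", "):
        name, value = part.rsplit(" ", 1)
        fields[name] = int(value)
    return fields

q = parse_disk_id(queried)
d = parse_disk_id(on_disk)

# List every field whose queried value differs from the on-disk value.
changed = [name for name in q if q[name] != d[name]]
print(changed)
```

Only h(id) differs; type, len, lun, devType and scsi are identical in both lines, so the LUN number itself is the same.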
I also tried to enable resignaturing with this command: esxcfg-advcfg -s 0 /LVM/EnableResignature
in vmkernel.log :
LVM: 10154: Extent zero missing
LVM: 10571: Failed to validate arguments
LVM: 9670: Failed resignaturing operation with status: Bad parameter
vmkfstools -P /vmfs/volumes/StoreVMFS shows that naa.600a0b800038b1a10000036b47b28b21:1 (/dev/sdf1) is the 1st extent.
If StoreVMFS has only 2 extents, my guess is that naa.600a0b80003a8bee000006bd4aea9229:1 (/dev/sdb1) is the 2nd extent, which is now detected as a snapshot, but esxcfg-volume --resignature cannot complete the resignature here.
Have you changed anything on naa.600a0b80003a8bee000006bd4aea9229 since the update to vSphere 4 Update 1, such as the LUN number?
If you have not changed anything, I would suggest downgrading to vSphere 4 and backing up StoreVMFS first.
binoche, VMware VCP, Cisco CCNA
I tried the downgrade yesterday, but the situation is the same. The 2nd extent appears with a changed ID.
I don't know why, because I didn't modify anything on the SAN storage.
I can't believe that VMware does not allow me to mount this extent just because an ID is different.
esxcfg-volume -l did not report naa.600a0b80003a8bee000006bd4aea9229:1 with the correct VMFS3 UUID/label; it shows "VMFS3 UUID/label: n.a./n.a.".
My guess is that the LUN ID naa.600a0b80003a8bee000006bd4aea9229 did not change after the upgrade, right? If only the LUN number changed, could you please revert the LUN number of naa.600a0b80003a8bee000006bd4aea9229? Thanks.
binoche, VMware VCP, Cisco CCNA
I looked more carefully at the vmkernel log on the machine that was not downgraded to vSphere 4.0.
I think that before the mix-up, the LUN had another ID: naa.600a0b80003a8bee000004294861a42e
Now it has naa.600a0b80003a8bee000006bd4aea9229.
The LUN number is the same; only the naa ID seems to have changed. Can I revert it somehow? Rename it to the old value?
Thanks
Please revert to the previous ID,
and then run the commands below:
esxcfg-rescan -d vmhb # to add naa.600a0b80003a8bee000004294861a42e
vmkfstools -V # to refresh all VMFS volumes
esxcfg-volume -l # to check whether naa.600a0b80003a8bee000004294861a42e is detected as a snapshot
vmkfstools -P -h /vmfs/volumes/StoreVMFS # to check whether StoreVMFS now has the correct extents
binoche, VMware VCP, Cisco CCNA
Thanks for the info, but I don't know how to revert to the previous ID. That's the problem, I guess...
I didn't change the LUN number or other settings on the storage. What else could trigger the naa ID change?
What storage is in use?
Is naa.600a0b80003a8bee000006bd4aea9229 really a snapshot on your storage, and was naa.600a0b80003a8bee000004294861a42e perhaps no longer presented to your host?
Can you still find naa.600a0b80003a8bee000004294861a42e on your storage?
binoche, VMware VCP, Cisco CCNA
It's an IBM DS3400 Fibre Channel storage system. There's no snapshot on the storage; there are only the 2 LUNs for the VMFS volume.
It seems the naa ID for the 2nd VMFS extent changed for some unknown reason.
Both LUNs are presented to the VMware hosts.
Could you please upload all /var/log/vmkernel* files? I would like to check again, thanks.
binoche, VMware VCP, Cisco CCNA
I did it! I managed to mount them.
I manually made a snapshot of the 1st VMFS partition and assigned this snapshot and the 2nd partition to the VMware hosts.
They were both detected as snapshots, but it mounted them:
VMFS-3.31 file system spanning 2 partitions.
File system label (if any): VMFSStore
Mode: public
Capacity 1023544393728 (976128 file blocks * 1048576), 223688523776 (213326 blocks) avail
UUID: 47b454b6-89ec4dc8-5a3e-001a6426cdb2
Partitions spanned (on "lvm"):
naa.600a0b800038b1a10000053d4b13194e:1
naa.600a0b80003a8bee000006bd4aea9229:1
I can access the data now.
I think I will try to do a Storage vMotion to a temporary VMFS, re-create the original VMFS and then Storage vMotion back.
Great!
Maybe you could consider NOT using a 2nd VMFS extent:
for example, use just one 950 GB LUN, and if you want to grow its capacity, grow the LUN on the DS3400 and then let VMFS grow its extent as well.
binoche, VMware VCP, Cisco CCNA
I will not use extents next time. I only used them before because the IBM DS3400 storage cannot expand a LUN (increase its size) once it has been created.
Hi Bosima,
You said you "manually made a snapshot of the 1st VMFS partition and asigned this snapshot and the 2nd partition to the vmware hosts". How do you manually make a snapshot of a VMFS partition?
Thanks...
Terry