Hi All,
After a RAID5 failure, which was resolved by Dell ProSupport, we cannot mount our main datastore, datastore1.
Below is the output of df -k, fdisk -l and esxcfg-volume -l.
Can anybody help us fix this?
Thanks a lot, Edwin,
~ # df -k
Filesystem 1k-blocks Used Available Use% Mounted on
VMFS-3 65798144 574464 65223680 1% /vmfs/volumes/datastore2
vfat 4192960 15616 4177344 0% /vmfs/volumes/4f45dd58-7aa89a02-d3d7-00137236b9e3
vfat 255716 135796 119920 53% /vmfs/volumes/Hypervisor1
vfat 255716 134740 120976 53% /vmfs/volumes/Hypervisor2
vfat 292752 182992 109760 63% /vmfs/volumes/Hypervisor3
~ # fdisk -l
Disk /dev/disks/naa.600188b039df110019587854044a724c: 898.3 GB, 898319253504 bytes
255 heads, 63 sectors/track, 109214 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/disks/naa.600188b039df110019587854044a724cp1 1 109215 877263872 fb VMFS
Disk /dev/disks/naa.600188b039df110013ec689beb15d5b4: 72.7 GB, 72746008576 bytes
64 heads, 32 sectors/track, 69376 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Device Boot Start End Blocks Id System
/dev/disks/naa.600188b039df110013ec689beb15d5b4p1 5 900 917504 5 Extended
/dev/disks/naa.600188b039df110013ec689beb15d5b4p2 901 4995 4193280 6 FAT16
/dev/disks/naa.600188b039df110013ec689beb15d5b4p3 4996 69376 65926144 fb VMFS
/dev/disks/naa.600188b039df110013ec689beb15d5b4p4 * 1 4 4080 4 FAT16 <32M
/dev/disks/naa.600188b039df110013ec689beb15d5b4p5 5 254 255984 6 FAT16
/dev/disks/naa.600188b039df110013ec689beb15d5b4p6 255 504 255984 6 FAT16
/dev/disks/naa.600188b039df110013ec689beb15d5b4p7 505 614 112624 fc VMKcore
/dev/disks/naa.600188b039df110013ec689beb15d5b4p8 615 900 292848 6 FAT16
~ # esxcfg-volume -l
VMFS UUID/label: 4c59cf5e-aa685370-125c-00137236b9e5/datastore1
Can mount: No (some extents missing)
Can resignature: No (some extents missing)
Extent name: naa.600188b039df110019587854044a724c:1 range: 0 - 129023 (MB)
Extent name: naa.600188b039df110019587854044a724c:1 range: 130048 - 131071 (MB)
Extent name: naa.600188b039df110019587854044a724c:1 range: 655360 - 783359 (MB)
Extent name: naa.600188b039df110019587854044a724c:1 range: 783360 - 856575 (MB)
You may want to check whether this KB applies. Do you still see the LUN? Were you using extents on the datastore?
VMware KB: Recovering a lost partition table on a VMFS volume
Can you provide the full fdisk -lu output? I don't see the output for datastore1.
Also the output of esxcfg-scsidevs -c.
~ # fdisk -lu
Disk /dev/disks/naa.600188b039df110019587854044a724c: 898.3 GB, 898319253504 bytes
255 heads, 63 sectors/track, 109214 cylinders, total 1754529792 sectors
Units = sectors of 1 * 512 = 512 bytes
Device Boot Start End Blocks Id System
/dev/disks/naa.600188b039df110019587854044a724cp1 2048 1754529791 877263872 fb VMFS
Disk /dev/disks/naa.600188b039df110013ec689beb15d5b4: 72.7 GB, 72746008576 bytes
64 heads, 32 sectors/track, 69376 cylinders, total 142082048 sectors
Units = sectors of 1 * 512 = 512 bytes
Device Boot Start End Blocks Id System
/dev/disks/naa.600188b039df110013ec689beb15d5b4p1 8192 1843199 917504 5 Extended
/dev/disks/naa.600188b039df110013ec689beb15d5b4p2 1843200 10229759 4193280 6 FAT16
/dev/disks/naa.600188b039df110013ec689beb15d5b4p3 10229760 142082047 65926144 fb VMFS
/dev/disks/naa.600188b039df110013ec689beb15d5b4p4 * 32 8191 4080 4 FAT16 <32M
/dev/disks/naa.600188b039df110013ec689beb15d5b4p5 8224 520191 255984 6 FAT16
/dev/disks/naa.600188b039df110013ec689beb15d5b4p6 520224 1032191 255984 6 FAT16
/dev/disks/naa.600188b039df110013ec689beb15d5b4p7 1032224 1257471 112624 fc VMKcore
/dev/disks/naa.600188b039df110013ec689beb15d5b4p8 1257504 1843199 292848 6 FAT16
Partition table entries are not in disk order
~ #
~ # esxcfg-scsidevs -c
Device UID Device Type Console Device Size Multipath Plugin Display Name
mpx.vmhba0:C0:T0:L0 CD-ROM /vmfs/devices/cdrom/mpx.vmhba0:C0:T0:L0 0MB NMP Local PHILIPS CD-ROM (mpx.vmhba0:C0:T0:L0)
mpx.vmhba32:C0:T0:L0 CD-ROM /vmfs/devices/cdrom/mpx.vmhba32:C0:T0:L0 0MB NMP Local USB CD-ROM (mpx.vmhba32:C0:T0:L0)
naa.600188b039df110013ec689beb15d5b4 Direct-Access /vmfs/devices/disks/naa.600188b039df110013ec689beb15d5b4 69376MB NMP Local DELL Disk (naa.600188b039df110013ec689beb15d5b4)
naa.600188b039df110019587854044a724c Direct-Access /vmfs/devices/disks/naa.600188b039df110019587854044a724c 856704MB NMP Local DELL Disk (naa.600188b039df110019587854044a724c)
t10.DP______BACKPLANE000000 Enclosure Svc Dev /vmfs/devices/genscsi/t10.DP______BACKPLANE000000 0MB NMP Local DP Enclosure Svc Dev (t10.DP______BACKPLANE000000)
~ #
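As a side check on the numbers above, here is a minimal Python sketch (my own, not a VMware tool) that cross-checks the sizes reported by fdisk -lu and esxcfg-scsidevs -c for naa.600188b039df110019587854044a724c. It assumes 512-byte sectors, as fdisk reports; the helper name partition_kib is made up for illustration.

```python
# Values copied from the fdisk -lu and esxcfg-scsidevs -c output in this thread.
SECTOR_BYTES = 512          # "Units = sectors of 1 * 512 = 512 bytes"
MIB = 1024 * 1024

disk_bytes = 898319253504       # fdisk: disk size in bytes
total_sectors = 1754529792      # fdisk: total sectors
part_start, part_end = 2048, 1754529791   # fdisk: partition 1 start/end
reported_blocks_1k = 877263872  # fdisk: "Blocks" column (1 KiB units)
reported_mb = 856704            # esxcfg-scsidevs: device size in MB


def partition_kib(start, end, sector_bytes=SECTOR_BYTES):
    """Size of an inclusive sector range in 1 KiB blocks."""
    return (end - start + 1) * sector_bytes // 1024


# Sectors left on the disk after the last sector of the partition.
sectors_after_partition = total_sectors - (part_end + 1)

print("sector count matches byte size:", total_sectors * SECTOR_BYTES == disk_bytes)
print("Blocks column matches start/end:", partition_kib(part_start, part_end) == reported_blocks_1k)
print("device size in MiB matches:", disk_bytes // MIB == reported_mb)
print("sectors after partition:", sectors_after_partition)
```

With the values from this thread all three checks agree and the partition runs to the last sector of the disk, so the partition entry is at least self-consistent. That says nothing about whether the VMFS metadata inside it is intact.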
I will check this and get back to you.
Although I did not set up this datastore, it's VMFS3 and under 2 TB, so I assume no extents were used.
Also, the datastore was always exactly 3x the disk size (plus 1 parity disk); it was never increased or anything.
We created new datastores on NFS instead.
/dev/disks/naa.600188b039df110019587854044a724cp1 1 109215 877263872 fb VMFS
Shows extents missing
Extent name: naa.600188b039df110019587854044a724c:1 range: 0 - 129023 (MB)
Extent name: naa.600188b039df110019587854044a724c:1 range: 130048 - 131071 (MB)
Extent name: naa.600188b039df110019587854044a724c:1 range: 655360 - 783359 (MB)
Extent name: naa.600188b039df110019587854044a724c:1 range: 783360 - 856575 (MB)
Was this datastore just one partition or was it extended with different RAID members?
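To make the "extents missing" picture concrete, here is a small Python sketch that lists the holes between the ranges esxcfg-volume -l printed. It treats the ranges as inclusive MB offsets on the device, which is my reading of the output rather than documented behaviour; find_gaps is a hypothetical helper name.

```python
# Extent ranges (MB) copied from the esxcfg-volume -l output above.
EXTENTS = [
    (0, 129023),
    (130048, 131071),
    (655360, 783359),
    (783360, 856575),
]


def find_gaps(ranges):
    """Return the inclusive (start, end) holes between sorted, inclusive ranges."""
    ordered = sorted(ranges)
    gaps = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end + 1:
            gaps.append((prev_end + 1, next_start - 1))
    return gaps


for start, end in find_gaps(EXTENTS):
    print(f"missing: {start} - {end} ({end - start + 1} MB)")
```

For the ranges in this thread that reports two holes: 129024 - 130047 (1024 MB) and 131072 - 655359 (524288 MB). The last two listed ranges are contiguous.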
All,
I have also updated the post with the vmkernel and storagerm logs.
Edwin,
No, it was not extended over multiple RAID sets.
This was only 1 RAID5 set consisting of 4 disks.
I have the feeling that I lost some control, partition or reference tables, and that the data is still there.
That's what Dell thinks too, but we are of course not sure.
Thanks for the logs
2013-06-22T17:21:21.026Z: No vmfs datastores found for naa.600188b039df110019587854044a724c
What if you try recreating the partition table on the device and check?
Use fdisk if it is VMFS3.
Use partedUtil if it is VMFS5.
Please open a support ticket with VMware if you are unfamiliar with the commands, as directly modifying the VMFS filesystem can have unrecoverable consequences if not done correctly.
How can we verify with 100% certainty which VMFS version is/was used, 3 or 5?
We are running ESXi 5.0 now.
I know we upgraded at one point from 4.x; that was done by a former colleague.
And can we open a support ticket with VMware?
We are using the free version of VMware ESXi.
The existing volumes are VMFS3, as they are being detected by fdisk. As for support, you can call the helpdesk number for your country and check.
I am guessing this is going to be paid support. You could try the first link I sent you, as that would be valid for VMFS 3.x, though it mentions ESX versions only up to 4.x.
If I use partedUtil, I see a 251 for partition 1, which according to the KB in your post (VMware KB: Using the partedUtil command line utility on ESXi and ESX)
is a datastore. That gives me some hope.
/var/log # partedUtil get /vmfs/devices/disks/naa.600188b039df110019587854044a724c
109214 255 63 1754529792
1 2048 1754529791 251 0
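For anyone reading along, here is a minimal Python sketch that parses the two lines above and shows that the decimal type 251 is the same value as the hex id fb in the fdisk listing. The field meanings (geometry line, then number/start/end/type plus a trailing attribute field) are my understanding of the partedUtil get format; the names Partition, attr and parse_parted_util_get are made up for this example.

```python
from typing import NamedTuple


class Partition(NamedTuple):
    number: int
    start: int      # first sector
    end: int        # last sector (inclusive)
    type_id: int    # partition type, printed in decimal by partedUtil
    attr: int       # trailing field; the name is a placeholder (assumption)


def parse_parted_util_get(text):
    """Split 'partedUtil get' output into (geometry, partitions)."""
    rows = [line.split() for line in text.strip().splitlines()]
    cylinders, heads, sectors_per_track, total_sectors = (int(v) for v in rows[0])
    partitions = [Partition(*(int(v) for v in row)) for row in rows[1:]]
    return (cylinders, heads, sectors_per_track, total_sectors), partitions


OUTPUT = """\
109214 255 63 1754529792
1 2048 1754529791 251 0
"""

geometry, partitions = parse_parted_util_get(OUTPUT)
for p in partitions:
    print(f"partition {p.number}: sectors {p.start}-{p.end}, type {p.type_id} = {hex(p.type_id)}")
```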
The following is what I would like to execute; please confirm this is OK:
fdisk -u /vmfs/devices/disks/naa.600188b039df110019587854044a724c
Create the partition:
Press n and press Enter to create a new partition.
Press p and press Enter to select that this is a primary partition.
Press 1 and press Enter to make the first partition.
Type 128 and press Enter to align the partition to sector 128.
Press Enter again to retain the default value.
Change the partition to type fb (VMFS):
Press t and press Enter. Partition 1 is automatically selected.
Enter fb and press Enter.
Press w and press Enter to save.
Run vmkfstools -V and press Enter to discover the VMFS.
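For reference, the start sector entered in fdisk fixes the byte offset at which the partition begins. Below is a small Python sketch of that arithmetic, assuming 512-byte sectors as in the fdisk -lu output. Both values come from this thread: 128 from the steps above and 2048 from the start currently shown by fdisk -lu and partedUtil. Which of the two matches the location of the original VMFS header is not something this calculation can tell you.

```python
SECTOR_BYTES = 512  # sector size reported by fdisk -lu in this thread


def start_offset_bytes(start_sector, sector_bytes=SECTOR_BYTES):
    """Byte offset on the disk at which a partition starting at start_sector begins."""
    return start_sector * sector_bytes


CANDIDATES = (
    ("start sector in the planned steps", 128),
    ("start sector currently reported", 2048),
)

for label, sector in CANDIDATES:
    offset = start_offset_bytes(sector)
    print(f"{label}: sector {sector} -> byte offset {offset} ({offset // 1024} KiB)")
```

The two offsets differ (64 KiB versus 1024 KiB), so a partition recreated at one start sector will not begin at the same place on disk as one created at the other.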
******************
What I still don't understand is what is going wrong here during the mount.
1. We have a live physical disk.
2. It shows as formatted, with a VMFS3 partition.
3. It shows itself under Configuration/Storage Adapters (the SCSI storage device).
Why doesn't it detect the datastore? Does it think that the extents are missing, and therefore fail?
And if so, is the action I planned above likely to solve the problem?
Just trying to get the right approach here.
Before that, you might have to delete the existing partition entry first (the d command inside fdisk). This will not format the LUN; it only removes the partition entry.
The rest of the commands look correct.
What I still don't understand is what is going wrong here during the mount.
1. We have a live physical disk.
2. It shows as formatted, with a VMFS3 partition.
3. It shows itself under Configuration/Storage Adapters (the SCSI storage device).
Why doesn't it detect the datastore? Does it think that the extents are missing, and therefore fail?
And if so, is the action I planned above likely to solve the problem?
Just trying to get the right approach here.
That just about sums it up!
I performed the actions, and ... no luck, I'm afraid.
This is the vmkernel.log output during the vmkfstools -V command:
2013-06-23T20:20:44.356Z cpu3:2051)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x1a (0x412400783780) to dev "naa.600188b039df110019587854044a724c" on path "vmhba1:C2:T1:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.Act:NONE
2013-06-23T20:20:44.356Z cpu3:2051)ScsiDeviceIO: 2316: Cmd(0x412400783780) 0x1a, CmdSN 0x1c69 to dev "naa.600188b039df110019587854044a724c" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
2013-06-23T20:20:56.059Z cpu4:32407)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2013-06-23T20:20:56.066Z cpu7:2055)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x1a (0x412400723840) to dev "mpx.vmhba0:C0:T0:L0" on path "vmhba0:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.Act:NONE
2013-06-23T20:20:56.066Z cpu7:2055)ScsiDeviceIO: 2316: Cmd(0x412400723840) 0x1a, CmdSN 0x1c7d to dev "mpx.vmhba0:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
2013-06-23T20:20:56.094Z cpu7:2055)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x1a (0x412400723840) to dev "t10.DP______BACKPLANE000000" on path "vmhba1:C0:T8:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.Act:NONE
2013-06-23T20:20:56.094Z cpu7:2055)ScsiDeviceIO: 2316: Cmd(0x412400723840) 0x1a, CmdSN 0x1c7e to dev "t10.DP______BACKPLANE000000" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
2013-06-23T20:20:56.111Z cpu4:32407)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2013-06-23T20:20:56.174Z cpu4:32407)FSS: 4333: No FS driver claimed device 'mpx.vmhba32:C0:T0:L0': Not supported
2013-06-23T20:20:56.185Z cpu4:32407)FSS: 4333: No FS driver claimed device 'naa.600188b039df110019587854044a724c:1': Not supported
2013-06-23T20:20:56.185Z cpu4:32407)Vol3: 647: Couldn't read volume header from control: Invalid handle
2013-06-23T20:20:56.185Z cpu4:32407)FSS: 4333: No FS driver claimed device 'control': Not supported
2013-06-23T20:20:56.192Z cpu4:32407)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2013-06-23T20:20:56.195Z cpu4:32407)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2013-06-23T20:20:56.197Z cpu4:32407)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2013-06-23T20:20:56.199Z cpu4:32407)<3>ata1.00: bad CDB len=16, scsi_op=0x9e, max=12
2013-06-23T20:20:56.204Z cpu4:32407)FSS: 4333: No FS driver claimed device 'mpx.vmhba0:C0:T0:L0': Not supported
2013-06-23T20:20:56.220Z cpu4:32407)VC: 1449: Device rescan time 77 msec (total number of devices 9)
2013-06-23T20:20:56.220Z cpu4:32407)VC: 1452: Filesystem probe time 105 msec (devices probed 8 of 9)
2013-06-23T20:21:43.260Z cpu4:32439)User: 2368: wantCoreDump : sfcb-hhrc -enabled : 0
I used DiskInternals VMFS Recovery and I can still see data present on the disk. But it does not show me VMDKs, only loose files.
So not all is lost. I need to find a way to get the full VMDKs off there.
Reboot the host once and check?
H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0
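In case it helps with reading those lines, here is a rough Python sketch that pulls the host/device/plugin status and the sense triple out of a vmkernel log line and maps them to names. The lookup tables cover only the codes that appear in this thread, with names as I know them from the SCSI specifications; decode_sense and the table names are made up for this example.

```python
import re

# Only the codes seen in this thread; anything else is printed as "unknown".
SCSI_STATUS = {0x0: "GOOD", 0x2: "CHECK CONDITION"}
SENSE_KEYS = {0x5: "ILLEGAL REQUEST"}
ASC_ASCQ = {(0x20, 0x00): "INVALID COMMAND OPERATION CODE"}

HEX = r"(0x[0-9a-fA-F]+)"
PATTERN = re.compile(
    rf"H:{HEX} D:{HEX} P:{HEX} Valid sense data: {HEX} {HEX} {HEX}"
)


def decode_sense(line):
    """Return a dict describing the status fields in a log line, or None if absent."""
    match = PATTERN.search(line)
    if match is None:
        return None
    host, device, plugin, key, asc, ascq = (int(v, 16) for v in match.groups())
    return {
        "host_status": host,
        "device_status": SCSI_STATUS.get(device, "unknown"),
        "plugin_status": plugin,
        "sense_key": SENSE_KEYS.get(key, "unknown"),
        "additional_sense": ASC_ASCQ.get((asc, ascq), "unknown"),
    }


LINE = "failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0."
print(decode_sense(LINE))
```

Read this way, the device answered the command with a check condition and reported that it does not support that operation code. The same status also appears in the log above for the CD-ROM and the backplane device.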
Did you look at the datastore with vmfs-fuse on Linux?
In my experience it has as good a chance as the DiskInternals tool, sometimes even better, and it's free.
You could also try out UFS Explorer - http://www.ufsexplorer.com/index.php It's not free but you can burn a live CD and test it out for free to see if it will fix your problem.