I had a disk fail yesterday, and after the rebuild I noticed one of my datastores was missing. I see the device but not the associated datastore.
/vmfs/devices/disks/eui.30f9584100d00000:1  59cd3f06-381b631f-8b26-002590707790  0  raid
I am running vSphere 6.5
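For anyone hitting the same symptom, a couple of standard ESXi commands can confirm that the device and its partition are still visible even though the datastore is gone (the device name below is the one from this thread; the block is guarded so the sketch also runs off-host):

```shell
# Device path taken from this thread; adjust for your own host.
DEV=/vmfs/devices/disks/eui.30f9584100d00000
if command -v partedUtil >/dev/null 2>&1; then
  partedUtil getptbl "$DEV"        # partition table should still list partition 1
  esxcli storage filesystem list   # the missing datastore will show as absent/unmounted
else
  echo "run on the ESXi host: partedUtil getptbl $DEV"
fi
```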
When I run VOMA I get the following output:
[root@ESX-1:/var/log] voma -m vmfs -f check -d /vmfs/devices/disks/eui.30f9584100d00000
Checking if device is actively used by other hosts
Running VMFS Checker version 2.1 in check mode
Initializing LVM metadata, Basic Checks will be done
Phase 1: Checking VMFS header and resource files
Detected VMFS file system (labeled:'raid') with UUID:59cd3f06-381b631f-8b26-002590707790, Version 5:81
ON-DISK ERROR: corrupted SFD address <FB c14268 r687> for resource file PB2
ERROR: Failed to check pb2.sf.
VOMA failed to check device : Severe corruption detected
Total Errors Found: 1
Kindly Consult VMware Support for further assistance
Is there any way to repair the device in question?
Thanks for any help.
It looks like you have broken metadata that VOMA can't fix.
Do you see any storage errors in vmkernel.log for the device with missing VMFS?
I see these errors. I don't understand why the Adaptec driver is dumping core:
2018-03-11T14:30:09.169Z cpu1:68070 opID=e755d70d)WARNING: Fil3: 1373: Failed to reserve volume f530 28 1 59cd3f06 381b631f 25008b26 90777090 0 0 0 0 0 0 0
2018-03-11T14:30:09.169Z cpu1:68070 opID=e755d70d)Vol3: 3091: Failed to get object 28 type 2 uuid 59cd3f06-381b631f-8b26-002590707790 FD 4 gen 1 :Not found
2018-03-11T14:30:09.212Z cpu1:68070 opID=e755d70d)Vol3: 3091: Failed to get object 28 type 1 uuid 59cd3f06-381b631f-8b26-002590707790 FD 0 gen 0 :Not found
2018-03-11T14:30:09.212Z cpu1:68070 opID=e755d70d)WARNING: Fil3: 1373: Failed to reserve volume f530 28 1 59cd3f06 381b631f 25008b26 90777090 0 0 0 0 0 0 0
2018-03-11T14:30:09.213Z cpu1:68070 opID=e755d70d)Vol3: 3091: Failed to get object 28 type 2 uuid 59cd3f06-381b631f-8b26-002590707790 FD 4 gen 1 :Not found
2018-03-11T14:31:58.248Z cpu11:68726)Resv: 407: Executed out-of-band reserve on eui.30f9584100d00000
2018-03-11T14:31:58.250Z cpu11:68726)Resv: 407: Executed out-of-band release on eui.30f9584100d00000
2018-03-11T14:32:58.095Z cpu4:68736)Resv: 407: Executed out-of-band reserve on eui.30f9584100d00000
2018-03-11T14:32:58.097Z cpu4:68736)Resv: 407: Executed out-of-band release on eui.30f9584100d00000
2018-03-11T14:57:06.456Z cpu2:68121)User: 3089: sfcb-adaptec_st: wantCoreDump:sfcb-adaptec_st signal:11 exitCode:0 coredump:enabled
2018-03-11T14:57:06.569Z cpu2:68121)UserDump: 3024: sfcb-adaptec_st: Dumping cartel 68090 (from world 68121) to file /var/core/sfcb-adaptec_st-zdump.001 ...
2018-03-11T14:57:13.040Z cpu20:68121)UserDump: 3172: sfcb-adaptec_st: Userworld(sfcb-adaptec_st) coredump complete.
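A quick way to isolate entries like these is to grep vmkernel.log for the volume UUID and the device name. On the host the input would be /var/log/vmkernel.log; below, a few lines from this thread stand in as sample data so the filter itself can be tried anywhere:

```shell
# Sample data: three representative lines from this thread.
cat <<'EOF' > /tmp/vmkernel-sample.log
2018-03-11T14:30:09.169Z cpu1:68070 opID=e755d70d)WARNING: Fil3: 1373: Failed to reserve volume f530 28 1 59cd3f06 381b631f 25008b26 90777090 0 0 0 0 0 0 0
2018-03-11T14:31:58.248Z cpu11:68726)Resv: 407: Executed out-of-band reserve on eui.30f9584100d00000
2018-03-11T14:57:06.456Z cpu2:68121)User: 3089: sfcb-adaptec_st: wantCoreDump:sfcb-adaptec_st signal:11 exitCode:0 coredump:enabled
EOF
# Match on the volume UUID prefix or the device name.
grep -E '59cd3f06|eui\.30f9584100d00000' /tmp/vmkernel-sample.log
```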
> Failed to check pb2.sf.
This is a very serious error. The VMFS metadata appears to be corrupt.
Please read my instructions here:
If you give me a download link for your dump-file I will look into it.
Your dump file is incomplete. It should either be an archive or, if you did not compress it, a file of 1536 MB, but it is only 256 MB.
Please try again or call me so that we can do it together.
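For reference, a metadata dump like the one requested here is typically made with dd against the VMFS partition; the 1536 MB figure is the size mentioned above. The block below uses a small stand-in file so the command shape can be exercised anywhere; on the host the input would be the real partition path:

```shell
# On the ESXi host the real command would look like:
#   dd if=/vmfs/devices/disks/eui.30f9584100d00000:1 of=<dest>/vmfs-meta.bin bs=1M count=1536
# Stand-in source file so this sketch runs off-host (4 MB instead of 1536 MB):
SRC=/tmp/fake-partition.bin
dd if=/dev/zero of="$SRC" bs=1M count=4 2>/dev/null
# The dump itself: copy the metadata region from the start of the "partition".
dd if="$SRC" of=/tmp/vmfs-meta.bin bs=1M count=4 2>/dev/null
ls -l /tmp/vmfs-meta.bin
```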
Sorry about that; it looks like /tmp didn't have enough space, so the file was truncated.
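A truncated dump like this can be avoided by checking free space at the destination first; a minimal sketch, using the 1536 MB size from earlier in the thread:

```shell
# How much the full dump needs, in KB (1536 MB, per the thread).
NEEDED_KB=$((1536 * 1024))
# Available space at the destination (column 4 of df -k is "Available").
AVAIL_KB=$(df -k /tmp | awk 'NR==2 {print $4}')
if [ "$AVAIL_KB" -ge "$NEEDED_KB" ]; then
  echo "enough space in /tmp for the dump"
else
  echo "only ${AVAIL_KB} KB free in /tmp; pick another destination"
fi
```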
Unfortunately the VMFS volume really is seriously damaged. A repair looks impossible.
To add to that, most of your VMDKs are thin-provisioned.
Given that you used so many thin-provisioned VMDKs and that you are probably on ESXi 6 or maybe even 6.5, the chances of recovering your data are not good.
Possibly recoverable flat VMDKs are:
"Solutions Integration Service-flat.vmdk"
I suggest you call me via Skype.
Good afternoon. I realize I'm digging up a pretty old thread here, but I'm wondering where this landed. I've encountered the exact same error. We lost one drive in a three-disk RAID 5, and for whatever reason the P400 RAID controller disabled the volume. We were able to replace the drive and get the RAID to rebuild, but the VMFS file system still won't mount. We're seeing the same error:
"ON-DISK ERROR: corrupted SFD address <INVALID c0 r0> for resource file PB2." Any chance anyone has any ideas?
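One thing worth ruling out in a post-rebuild case like this: after a controller event the host can treat the volume as an unresolved snapshot copy and refuse to auto-mount it, which looks superficially similar to corruption. The standard ESXi commands below list such volumes (guarded so the sketch also runs off-host):

```shell
if command -v esxcli >/dev/null 2>&1; then
  esxcli storage vmfs snapshot list   # unresolved VMFS snapshot/copy volumes
  esxcfg-volume -l                    # older tool with the same listing
else
  echo "esxcli not available: run this on the ESXi host"
fi
```

If the volume shows up there, it can often be resolved by mounting or resignaturing it rather than repairing metadata.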