VMware Cloud Community
bct0109
Contributor

VOMA failed to check device: Severe corruption detected

I had a disk fail yesterday, and after the rebuild I noticed one of my datastores was missing.  I see the device but not the associated datastore.

/vmfs/devices/disks/eui.30f9584100d00000:1                                                         59cd3f06-381b631f-8b26-002590707790  0  raid

I am running vSphere 6.5

When I run VOMA I get the following output:

[root@ESX-1:/var/log] voma -m vmfs -f check -d /vmfs/devices/disks/eui.30f9584100d00000

Checking if device is actively used by other hosts

Running VMFS Checker version 2.1 in check mode

Initializing LVM metadata, Basic Checks will be done

Phase 1: Checking VMFS header and resource files

   Detected VMFS file system (labeled:'raid') with UUID:59cd3f06-381b631f-8b26-002590707790, Version 5:81

ON-DISK ERROR: corrupted SFD address <FB c14268 r687> for resource file PB2

         ERROR: Failed to check pb2.sf.

   VOMA failed to check device : Severe corruption detected

Total Errors Found:           1

   Kindly Consult VMware Support for further assistance

Is there any way to repair the device in question?

Thanks for any help

8 Replies
Finikiez
Champion

Hi!

It looks like you have broken metadata that VOMA can't fix.

Do you see any storage errors in vmkernel.log for the device with missing VMFS?
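
As a rough sketch of how to check this, you can filter the log for the device identifier and the VMFS UUID. The function name `find_device_events` is made up for illustration; `/var/log/vmkernel.log` is the usual location on ESXi, and the identifiers in the example call come from the VOMA output earlier in this thread. It assumes the host's `grep` supports `-F` and `-e`.

```shell
#!/bin/sh
# Hypothetical helper: print log lines mentioning the device ID or VMFS UUID.
# Usage: find_device_events LOGFILE DEVICE_ID VMFS_UUID
find_device_events() {
    log="$1"; dev="$2"; uuid="$3"
    # -F treats the patterns as fixed strings, so the dots in the
    # device ID are not interpreted as regex wildcards.
    grep -F -e "$dev" -e "$uuid" "$log"
}

# Example call with the identifiers from this thread (run on the ESXi host):
# find_device_events /var/log/vmkernel.log \
#     eui.30f9584100d00000 59cd3f06-381b631f-8b26-002590707790
```

Matching on both identifiers helps because some messages name the device and others only the volume UUID.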

bct0109
Contributor

I see these errors. I don't understand why the Adaptec driver is dumping core:

2018-03-11T14:30:09.169Z cpu1:68070 opID=e755d70d)WARNING: Fil3: 1373: Failed to reserve volume f530 28 1 59cd3f06 381b631f 25008b26 90777090 0 0 0 0 0 0 0

2018-03-11T14:30:09.169Z cpu1:68070 opID=e755d70d)Vol3: 3091: Failed to get object 28 type 2 uuid 59cd3f06-381b631f-8b26-002590707790 FD 4 gen 1 :Not found

2018-03-11T14:30:09.212Z cpu1:68070 opID=e755d70d)Vol3: 3091: Failed to get object 28 type 1 uuid 59cd3f06-381b631f-8b26-002590707790 FD 0 gen 0 :Not found

2018-03-11T14:30:09.212Z cpu1:68070 opID=e755d70d)WARNING: Fil3: 1373: Failed to reserve volume f530 28 1 59cd3f06 381b631f 25008b26 90777090 0 0 0 0 0 0 0

2018-03-11T14:30:09.213Z cpu1:68070 opID=e755d70d)Vol3: 3091: Failed to get object 28 type 2 uuid 59cd3f06-381b631f-8b26-002590707790 FD 4 gen 1 :Not found

2018-03-11T14:31:58.248Z cpu11:68726)Resv: 407: Executed out-of-band reserve on eui.30f9584100d00000

2018-03-11T14:31:58.250Z cpu11:68726)Resv: 407: Executed out-of-band release on eui.30f9584100d00000

2018-03-11T14:32:58.095Z cpu4:68736)Resv: 407: Executed out-of-band reserve on eui.30f9584100d00000

2018-03-11T14:32:58.097Z cpu4:68736)Resv: 407: Executed out-of-band release on eui.30f9584100d00000

2018-03-11T14:57:06.456Z cpu2:68121)User: 3089: sfcb-adaptec_st: wantCoreDump:sfcb-adaptec_st signal:11 exitCode:0 coredump:enabled

2018-03-11T14:57:06.569Z cpu2:68121)UserDump: 3024: sfcb-adaptec_st: Dumping cartel 68090 (from world 68121) to file /var/core/sfcb-adaptec_st-zdump.001 ...

2018-03-11T14:57:13.040Z cpu20:68121)UserDump: 3172: sfcb-adaptec_st: Userworld(sfcb-adaptec_st) coredump complete.

continuum
Immortal

> Failed to check pb2.sf.
This is a very serious error. The VMFS-metadata appears to be corrupt.
Please read my instructions here:
http://vm-sickbay.com/create-a-vmfs-header-dump-using-an-esxi-host-in-production
If you give me a download link for your dump-file I will look into it.
Ulli


________________________________________________
Do you need support with a VMFS recovery problem ? - send a message via skype "sanbarrow"
I do not support Workstation 16 at this time ...

bct0109
Contributor

continuum
Immortal

Your dump file is incomplete. It should be either an archive or, if you did not compress it, 1536 MB in size, but it is only 256 MB.
Please try again or call me so that we can do it together.
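
A quick way to catch a truncated dump before uploading it is to compare the file size with the expected 1536 MB. This is only a minimal sketch, assuming 1 MB means 1048576 bytes; the function name `check_dump_size` and the path in the usage line are placeholders and are not taken from the linked instructions.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if FILE is at least EXPECTED_MB megabytes.
# Usage: check_dump_size FILE EXPECTED_MB
check_dump_size() {
    file="$1"; expected_mb="$2"
    # Return 2 if the file does not exist at all.
    [ -f "$file" ] || { echo "missing: $file"; return 2; }
    # wc -c prints the byte count; tr strips any padding spaces.
    actual=$(wc -c < "$file" | tr -d ' ')
    expected=$((expected_mb * 1024 * 1024))
    if [ "$actual" -lt "$expected" ]; then
        echo "truncated: $actual of $expected bytes"
        return 1
    fi
    echo "ok: $actual bytes"
}
```

For an uncompressed dump, something like `check_dump_size /tmp/dump.bin 1536` would report a short file; a compressed archive will naturally be smaller, so the check only applies to the raw file.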
Ulli


bct0109
Contributor

Sorry about that. It looks like /tmp didn't have enough space, so the file was truncated.

https://drive.google.com/open?id=1RkEJtYNmehP3alkymrF5egPovFmIbsiJ

continuum
Immortal

Unfortunately, the VMFS volume really is seriously damaged. A repair looks impossible.
To add to that, most of your VMDKs are thin-provisioned.
Given that you used so many thin-provisioned VMDKs and that you are probably using ESXi 6 or maybe even 6.5, the chances of recovering your data are not that good.
The flat VMDKs that may be recoverable are:

"BRIAN-TEST-flat.vmdk"
"BRIAN-TEST_2-flat.vmdk"
"Brian-PC2-flat.vmdk"
"Brian-PC3-flat.vmdk"
"LinuxMint-flat.vmdk"
"Solutions Integration Service-flat.vmdk"
"TEST-VMDK-flat.vmdk"
"USXXTAYLOB9L1C.corp.emc.com-flat.vmdk"
"Ubuntu-flat.vmdk"
"Win2012_1-flat.vmdk"
"Win2012_2-flat.vmdk"
"Win2012_3-flat.vmdk"
"Win2012_4-flat.vmdk"
"smtp-server-flat.vmdk"
"vCenter-flat.vmdk"
"vCenter_1-flat.vmdk"
"vCenter_10-flat.vmdk"
"vCenter_11-flat.vmdk"
"vCenter_2-flat.vmdk"
"vCenter_3-flat.vmdk"
"vCenter_4-flat.vmdk"
"vCenter_5-flat.vmdk"
"vCenter_6-flat.vmdk"
"vCenter_7-flat.vmdk"
"vCenter_8-flat.vmdk"
"vCenter_9-flat.vmdk"
"win10-test-flat.vmdk"
"win2012-1-flat.vmdk"
"win2012-1_1-flat.vmdk"

I suggest you call me via skype

Ulli


jaredhallen
Contributor

Good afternoon. I realize I'm digging up a pretty old thread here, but I'm wondering where this landed. I've encountered the exact same error. We lost one drive in a three-disk RAID 5, and for whatever reason the P400 RAID controller disabled the volume. We were able to replace the drive and get the RAID to rebuild, but the VMFS file system still won't mount. We're seeing that same error:

"ON-DISK ERROR: corrupted SFD address <INVALID c0 r0> for resource file PB2." Any chance anyone has any ideas?
