Hey guys, hopefully somebody can help with this or point me in the right direction. I lost one of my datastores after rebooting an ESXi 6.7.0 host (the VMs were shut down and the host was in maintenance mode), and it no longer shows up in the storage/datastore tab of ESXi.
However, the VMFS partition is still displayed when viewing the storage device structure. VOMA shows the output below; I assume the ON-DISK ERROR is the culprit. Manually mounting by UUID doesn't work, and VOMA doesn't have a fix option for VMFS-6 yet, so I'm not sure where to go from here. Thanks in advance.
Phase 1: Checking VMFS header and resource files
Detected VMFS-6 file system (labeled:'Primary') with UUID:5b0440a2-7dbb4c4b-de69-a0369fe03066, Version 6:82
Found stale lock [type 10c00003 offset 286449664 v 2, hb offset 3837952
gen 1, mode 1, owner 5baba25d-063a88f4-62a5-a0369fe03066 mtime 37
num 0 gblnum 0 gblgen 0 gblbrk 0]
Found stale lock [type 10c00003 offset 15070576640 v 2, hb offset 3833856
gen 103, mode 1, owner 5bab9ade-3cf65242-a144-a0369fe03066 mtime 429
num 0 gblnum 0 gblgen 0 gblbrk 0]
Found stale lock [type 10c00008 offset 16195584 v 6, hb offset 3837952
gen 1, mode 1, owner 5baba25d-063a88f4-62a5-a0369fe03066 mtime 81
num 0 gblnum 0 gblgen 0 gblbrk 0]
Found stale lock [type 10c00002 offset 9928704 v 6, hb offset 3837952
gen 1, mode 1, owner 5baba25d-063a88f4-62a5-a0369fe03066 mtime 35
num 0 gblnum 0 gblgen 0 gblbrk 0]
Found stale lock [type 10c00002 offset 16392192 v 6, hb offset 3837952
gen 1, mode 1, owner 5baba25d-063a88f4-62a5-a0369fe03066 mtime 29
num 0 gblnum 0 gblgen 0 gblbrk 0]
Cluster 785 unmap lock set while no pending unmaps, stale lock
ON-DISK ERROR: Cluster 785 free locked for unmap 457 should be 224
Found stale lock [type 10c00002 offset 16465920 v 4, hb offset 3837952
gen 1, mode 1, owner 5baba25d-063a88f4-62a5-a0369fe03066 mtime 32
num 0 gblnum 0 gblgen 0 gblbrk 0]
Phase 2: Checking VMFS heartbeat region
Marking Journal addr (14, 0) in use
Phase 3: Checking all file descriptors.
Phase 4: Checking pathname and connectivity.
Phase 5: Checking resource reference counts.
Total Errors Found: 1
The vmkernel log also shows this warning several times:
2018-09-26T17:13:18.685Z cpu2:2097320)WARNING: Vol3: 3102: Primary/5b0440a2-7dbb4c4b-de69-a0369fe03066: Invalid physDiskBlockSize 512
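For anyone reading along, here is a minimal sketch of how the stale-lock entries in a VOMA report like the one above could be tallied per owner UUID, which makes it easier to see whether one host is behind most of them. The function name `summarize_voma` and the regular expression are illustrative assumptions based only on the line layout shown in this post; they are not part of VOMA or any VMware tool, and the sample is a shortened copy of the output above.

```python
import re
from collections import Counter

# Shortened sample copied from the VOMA output in this post.
VOMA_SAMPLE = """\
Found stale lock [type 10c00003 offset 286449664 v 2, hb offset 3837952
gen 1, mode 1, owner 5baba25d-063a88f4-62a5-a0369fe03066 mtime 37
num 0 gblnum 0 gblgen 0 gblbrk 0]
Found stale lock [type 10c00003 offset 15070576640 v 2, hb offset 3833856
gen 103, mode 1, owner 5bab9ade-3cf65242-a144-a0369fe03066 mtime 429
num 0 gblnum 0 gblgen 0 gblbrk 0]
ON-DISK ERROR: Cluster 785 free locked for unmap 457 should be 224
Total Errors Found: 1
"""

# Assumed layout: the lock header line is followed by a "gen ..., owner ..." line.
LOCK_RE = re.compile(
    r"Found stale lock \[type (?P<type>[0-9a-f]+) offset (?P<offset>\d+) v \d+, "
    r"hb offset (?P<hb>\d+)\s+gen (?P<gen>\d+), mode \d+, owner (?P<owner>[0-9a-f-]+)"
)


def summarize_voma(text):
    """Return (locks, lock count per owner, ON-DISK ERROR lines)."""
    locks = [m.groupdict() for m in LOCK_RE.finditer(text)]
    by_owner = Counter(lock["owner"] for lock in locks)
    errors = [ln for ln in text.splitlines() if ln.startswith("ON-DISK ERROR:")]
    return locks, by_owner, errors


if __name__ == "__main__":
    locks, by_owner, errors = summarize_voma(VOMA_SAMPLE)
    for owner, count in by_owner.items():
        print(owner, count)
    for line in errors:
        print(line)
```

Fed the full report, this would show how many stale locks each owner holds; it only reads the text and does nothing to the volume.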
Hello
Have a look at Locked files with VMFS 6 | VM-Sickbay
If you want me to have a closer look, create a VMFS header dump - see
Create a VMFS-Header-dump using an ESXi-Host in production | VM-Sickbay
Ulli
I've made a header backup, uploaded it, and attached it here. Replacing the heartbeat section with a clean one did not resolve the issue; this header dump was taken before I overwrote the corrupted partition's heartbeat section. Thanks for your help so far.
Edit: also, here's a new VOMA output:
Checking if device is actively used by other hosts
Scanning for VMFS-6 host activity (4096 bytes/HB, 1024 HBs).
Running VMFS Checker version 2.1 in default mode
Initializing LVM metadata, Basic Checks will be done
Phase 1: Checking VMFS header and resource files
Detected VMFS-6 file system (labeled:'Primary') with UUID:5b0440a2-7dbb4c4b-de69-a0369fe03066, Version 6:82
Found stale lock [type 10c00003 offset 286449664 v 2, hb offset 3837952
gen 1, mode 1, owner 5baba25d-063a88f4-62a5-a0369fe03066 mtime 37
num 0 gblnum 0 gblgen 0 gblbrk 0]
Found stale lock [type 10c00003 offset 15070576640 v 2, hb offset 3833856
gen 103, mode 1, owner 5bab9ade-3cf65242-a144-a0369fe03066 mtime 429
num 0 gblnum 0 gblgen 0 gblbrk 0]
Found stale lock [type 10c00008 offset 16195584 v 6, hb offset 3837952
gen 1, mode 1, owner 5baba25d-063a88f4-62a5-a0369fe03066 mtime 81
num 0 gblnum 0 gblgen 0 gblbrk 0]
Found stale lock [type 10c00002 offset 9928704 v 6, hb offset 3837952
gen 1, mode 1, owner 5baba25d-063a88f4-62a5-a0369fe03066 mtime 35
num 0 gblnum 0 gblgen 0 gblbrk 0]
Found stale lock [type 10c00002 offset 16392192 v 6, hb offset 3837952
gen 1, mode 1, owner 5baba25d-063a88f4-62a5-a0369fe03066 mtime 29
num 0 gblnum 0 gblgen 0 gblbrk 0]
Cluster 785 unmap lock set while no pending unmaps, stale lock
ON-DISK ERROR: Cluster 785 free locked for unmap 457 should be 224
Found stale lock [type 10c00002 offset 16465920 v 4, hb offset 3837952
gen 1, mode 1, owner 5baba25d-063a88f4-62a5-a0369fe03066 mtime 32
num 0 gblnum 0 gblgen 0 gblbrk 0]
Phase 2: Checking VMFS heartbeat region
Phase 3: Checking all file descriptors.
Phase 4: Checking pathname and connectivity.
Phase 5: Checking resource reference counts.
ON-DISK ERROR: JBC inconsistency found: (14,0) allocated in bitmap, but never used
Total Errors Found: 2
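Since the second run is almost identical to the first, it can help to list only the lines that differ. The sketch below does that for the tail of each report (Phase 2 onward), copied from the two outputs in this thread. The helper name `diff_runs` is made up for illustration, and it compares lines as plain text only.

```python
# Tail of the first VOMA run (Phase 2 onward), copied from the first post.
FIRST_TAIL = """\
Phase 2: Checking VMFS heartbeat region
Marking Journal addr (14, 0) in use
Phase 3: Checking all file descriptors.
Phase 4: Checking pathname and connectivity.
Phase 5: Checking resource reference counts.
Total Errors Found: 1
"""

# Tail of the second VOMA run, copied from the post above.
SECOND_TAIL = """\
Phase 2: Checking VMFS heartbeat region
Phase 3: Checking all file descriptors.
Phase 4: Checking pathname and connectivity.
Phase 5: Checking resource reference counts.
ON-DISK ERROR: JBC inconsistency found: (14,0) allocated in bitmap, but never used
Total Errors Found: 2
"""


def diff_runs(first, second):
    """Return (lines only in first, lines only in second), order preserved."""
    first_lines = [ln.strip() for ln in first.splitlines() if ln.strip()]
    second_lines = [ln.strip() for ln in second.splitlines() if ln.strip()]
    only_first = [ln for ln in first_lines if ln not in second_lines]
    only_second = [ln for ln in second_lines if ln not in first_lines]
    return only_first, only_second


if __name__ == "__main__":
    only_first, only_second = diff_runs(FIRST_TAIL, SECOND_TAIL)
    print("Only in first run:", *only_first, sep="\n  ")
    print("Only in second run:", *only_second, sep="\n  ")
```

On these two tails the differences come down to the journal line at (14, 0) in the first run versus the JBC inconsistency at (14,0) in the second, plus the error totals.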
Just downloaded the dump ...
This is a tough one ...
OSF-Windows-Server-2016 seems readable; OSF-CentOS-Plesk has a problem.
I will definitely need more time for this.
Ulli
The Plesk VM is not entirely necessary; I have a pretty recent complete backup of it.
Please run the command
dd if=/dev/disks/device bs=1M count=10 skip=278540 of=/tmp/test.bin
where device is the same one you used to create the VMFS header dump.
Download /tmp/test.bin
Compress the file and attach it to your next reply.
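To make clear which part of the disk that dd command reads, here is a small sketch of the arithmetic. It assumes that `bs=1M` means 1 MiB (1024 * 1024 bytes), which is how dd commonly interprets the suffix; the helper name `dd_window` is made up for illustration.

```python
MIB = 1024 * 1024  # assumption: dd's "1M" block size is 1 MiB


def dd_window(bs, skip, count):
    """Return (start byte, length in bytes, end byte) of the region dd reads."""
    start = skip * bs      # skip= is counted in input blocks of size bs
    length = count * bs    # count= is also counted in blocks of size bs
    return start, length, start + length


if __name__ == "__main__":
    start, length, end = dd_window(MIB, skip=278540, count=10)
    print(f"start : {start} bytes")
    print(f"length: {length} bytes")
    print(f"end   : {end} bytes")
```

Under that assumption the command copies 10 MiB starting 278540 MiB into the device, so the resulting test.bin should be 10485760 bytes before compression.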
Please look at this partition table - is this the Windows boot disk you need?
If yes - install AnyDesk and call me / send a message via Skype.
Ulli
Please let me know if you are still interested.
The success rate of such operations is much better if there is no unnecessary delay between the steps ...
Yes, I am. The partition table looks about right for the Windows disk. I'll contact you on Skype shortly.