I had a scenario where some of my ESXi hosts (managed by vCenter Server Appliance) were ungracefully disconnected from a datastore.
This datastore was on a LUN which was located on a Synology NAS, DS1517+.
I reestablished the iSCSI connection between two of the hosts and the LUN, but now rescanning the storage device does not show the datastore.
The main problem here is that a number of important VMs are on that datastore, so I would like to recover it if possible.
I have a backup of the full LUN (not of individual VMs), which I haven't used yet because restoring it would overwrite the existing LUN. I would like to save that as a last-ditch effort.
Here's the debug information I can provide so far. Maybe some of you can piece together what's happening.
I will openly admit I'm quite green when it comes to VMware, and any help towards solving this is immensely appreciated.
[root@XXXvspherehost1:~] esxcli storage core path list
...
iqn.1998-01.com.vmware:cssvspherehost1-658ace6e-00023d000002,iqn.2008-06.com.css-design:StorageCluster.VM-Target,t,1-naa.60014056f5e3bd8d68c9d4139dbaded5
UID: iqn.1998-01.com.vmware:cssvspherehost1-658ace6e-00023d000002,iqn.2008-06.com.css-design:StorageCluster.VM-Target,t,1-naa.60014056f5e3bd8d68c9d4139dbaded5
Runtime Name: vmhba37:C0:T0:L1
Device: naa.60014056f5e3bd8d68c9d4139dbaded5
Device Display Name: SYNOLOGY iSCSI Disk (naa.60014056f5e3bd8d68c9d4139dbaded5)
Adapter: vmhba37
Channel: 0
Target: 0
LUN: 1
Plugin: NMP
State: active
Transport: iscsi
Adapter Identifier: iqn.1998-01.com.vmware:XXXvspherehost1-658ace6e
Target Identifier: 00023d000002,iqn.2008-06.com.XXX:StorageCluster.VM-Target,t,1
Adapter Transport Details: iqn.1998-01.com.vmware:XXXvspherehost1-658ace6e
Target Transport Details: IQN=iqn.2008-06.com.XXX:StorageCluster.VM-Target Alias= Session=00023d000002 PortalTag=1
Maximum IO Size: 131072
[root@XXXvspherehost1:~] esxcli storage core device list
naa.60014056f5e3bd8d68c9d4139dbaded5
Display Name: SYNOLOGY iSCSI Disk (naa.60014056f5e3bd8d68c9d4139dbaded5)
Has Settable Display Name: true
Size: 2048000
Device Type: Direct-Access
Multipath Plugin: NMP
Devfs Path: /vmfs/devices/disks/naa.60014056f5e3bd8d68c9d4139dbaded5
Vendor: SYNOLOGY
Model: iSCSI Storage
Revision: 4.0
SCSI Level: 5
Is Pseudo: false
Status: degraded
Is RDM Capable: true
Is Local: false
Is Removable: false
Is SSD: false
Is VVOL PE: false
Is Offline: false
Is Perennially Reserved: false
Queue Full Sample Size: 0
Queue Full Threshold: 0
Thin Provisioning Status: yes
Attached Filters:
VAAI Status: unknown
Other UIDs: vml.020001000060014056f5e3bd8d68c9d4139dbaded5695343534920
Is Shared Clusterwide: true
Is Local SAS Device: false
Is SAS: false
Is USB: false
Is Boot USB Device: false
Is Boot Device: false
Device Max Queue Depth: 128
No of outstanding IOs with competing worlds: 32
Drive Type: unknown
RAID Level: unknown
Number of Physical Drives: unknown
Protection Enabled: false
PI Activated: false
PI Type: 0
PI Protection Mask: NO PROTECTION
Supported Guard Types: NO GUARD SUPPORT
DIX Enabled: false
DIX Guard Type: NO GUARD SUPPORT
Emulated DIX/DIF Enabled: false
[root@XXXvspherehost1:~] esxcli storage vmfs extent list
Volume Name            VMFS UUID                            Extent Number  Device Name                                                                 Partition
---------------------  -----------------------------------  -------------  --------------------------------------------------------------------------  ---------
datastore1             57b5829e-792ba3ad-e735-f48e38c4e28a              0  t10.ATA_____WDC_WD5003ABYX2D18WERA0_______________________WD2DWMAYP0K9LHZU          3
XXX-iscsi-datastore-1  5afda8ec-359e5ffb-30b1-f48e38c4e28a              0  naa.60014056f5e3bd8d68c9d4139dbaded5                                                1
[root@XXXvspherehost1:~] esxcli storage filesystem list
Error getting data for filesystem on '/vmfs/volumes/5afda8ec-359e5ffb-30b1-f48e38c4e28a': Cannot open volume: /vmfs/volumes/5afda8ec-359e5ffb-30b1-f48e38c4e28a, skipping.
[root@XXXvspherehost1:~] voma -m vmfs -f check -d /vmfs/devices/disks/naa.60014056f5e3bd8d68c9d4139dbaded5
Checking if device is actively used by other hosts
Running VMFS Checker version 1.2 in check mode
Initializing LVM metadata, Basic Checks will be done
Phase 1: Checking VMFS header and resource files
Detected VMFS file system (labeled:'XXX-iscsi-datastore-1') with UUID:5afda8ec-359e5ffb-30b1-f48e38c4e28a, Version 5:61
Phase 2: Checking VMFS heartbeat region
Phase 3: Checking all file descriptors.
Phase 4: Checking pathname and connectivity.
Phase 5: Checking resource reference counts.
ON-DISK ERROR: FB inconsistency found: (7925,1) allocated in bitmap, but never used
ON-DISK ERROR: FB inconsistency found: (7925,2) allocated in bitmap, but never used
ON-DISK ERROR: FB inconsistency found: (7925,4) allocated in bitmap, but never used
Total Errors Found: 3
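For anyone who wants to compare voma runs across hosts or over time, here is a minimal Python sketch that tallies the errors from saved voma output. It assumes the output was captured to a string and only recognizes the two line formats visible in the run above ("ON-DISK ERROR: ..." and "Total Errors Found: N"). The name summarize_voma is hypothetical and not part of voma or any VMware tooling.

```python
import re

# Assumption: these patterns cover only the line formats seen in the
# voma output quoted in this thread; other voma messages are ignored.
ERROR_RE = re.compile(r"^ON-DISK ERROR:\s*(?P<message>.+)$")
TOTAL_RE = re.compile(r"^Total Errors Found:\s*(?P<count>\d+)$")


def summarize_voma(output):
    """Collect ON-DISK ERROR messages and the reported total (hypothetical helper)."""
    errors = []
    reported_total = None
    for raw_line in output.splitlines():
        line = raw_line.strip()
        error_match = ERROR_RE.match(line)
        if error_match:
            errors.append(error_match.group("message"))
            continue
        total_match = TOTAL_RE.match(line)
        if total_match:
            reported_total = int(total_match.group("count"))
    return {
        "errors": errors,
        "reported_total": reported_total,
        # True when the summary line agrees with the lines we counted.
        "consistent": reported_total == len(errors),
    }


# The tail of the voma run from this thread, copied verbatim.
SAMPLE = """\
Phase 5: Checking resource reference counts.
ON-DISK ERROR: FB inconsistency found: (7925,1) allocated in bitmap, but never used
ON-DISK ERROR: FB inconsistency found: (7925,2) allocated in bitmap, but never used
ON-DISK ERROR: FB inconsistency found: (7925,4) allocated in bitmap, but never used
Total Errors Found: 3
"""

if __name__ == "__main__":
    summary = summarize_voma(SAMPLE)
    for message in summary["errors"]:
        print(message)
    print("reported total:", summary["reported_total"])
```

The "consistent" flag is only a sanity check that the captured output was not truncated. It says nothing about whether the datastore is recoverable.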
I ended up figuring out the issue: the LUNs had been corrupted after the disconnection.
Luckily, I had a backup and was able to get the backup up and running.
Looking at this issue, it appears that everything seen here is a result of ESXi attempting to mount the datastore but being unable to because of the corruption of the LUN.
Thanks everyone for your help!
If the LUN was abruptly unpresented from the hosts while active VMs were residing on the datastore, then once the presentation issue has been fixed, either the hosts have to be rebooted or the datastore-related tasks have to be killed before the datastore can be remounted. More details are in the VMware Knowledge Base article.
Cheers,
Supreet