Hi!
Three ESXi 6.0.0 hosts had iSCSI configured against a Synology NAS that went down. We bought a new NAS, repaired the old one, and migrated the LUN from the old NAS to the new one. When I tried to reconnect one of the ESXi hosts, I found that its iSCSI Software Adapter still contained information about the old connection. I deleted the previous target under Static Discovery and created a new one via Dynamic Discovery. Under Devices and Paths, however, I could not delete the stale entry; it is frozen:
I went ahead and connected the ESXi host to the restored LUN anyway. I added the target via Dynamic Discovery, then clicked Rescan Adapter, then Rescan Storage. On the NAS side I can see an iSCSI connection. The problem is that the previously configured datastore, which should have appeared in the Datastores section of the ESXi host after this procedure, did not show up. It's gone. Why is that? It seems that ESXi treats it as a new device. How do I get back the datastore named FS1-iSCSI that was previously backed by the old LUN?
It is still configured, in an Inaccessible state, on the other two ESXi hosts.
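For completeness, the rough CLI equivalent of what I did in the UI would be something like this (vmhba33 stands for the software iSCSI adapter on my host and 192.168.X.X for the new NAS, so adjust as needed):
esxcli iscsi adapter discovery sendtarget add -A vmhba33 -a 192.168.X.X:3260   # new Dynamic Discovery target
esxcli storage core adapter rescan -A vmhba33                                  # Rescan Adapter
vmkfstools -V                                                                  # Rescan Storage (refresh VMFS volumes)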
Screenshot from the SAN Manager on the Synology NAS (LUN-1 was backed up with the Hyper Backup package on the Synology and recovered with Hyper Backup Vault):
More screenshots from the 192.168.140.102 ESXi host:
Devices:
Path:
Dynamic Discovery:
Static Discovery:
Any help is appreciated.
Check to make sure it's not being recognised as a snapshot.
SSH onto a host and use this command to find out whether it is being recognised as a snapshot:
"esxcli storage vmfs snapshot list"
This will return the Volume name and a VMFS UUID
Then, using that UUID, you can mount it with this command:
esxcfg-volume -M "vmfs uuid"
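One small aside: as far as I know, lower-case -m mounts the volume only until the next reboot, while upper-case -M makes the mount persistent, i.e.
esxcfg-volume -m "vmfs uuid"   # temporary mount, gone after a reboot
esxcfg-volume -M "vmfs uuid"   # persistent mount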
[root@esxi2:~] esxcli storage vmfs snapshot list
569de90e-e375c969-095d-6cc21735c080
Volume Name: FS1-iSCSI
VMFS UUID: 569de90e-e375c969-095d-6cc21735c080
Can mount: false
Reason for un-mountability: the original volume has some extents online
Can resignature: true
Reason for non-resignaturability:
Unresolved Extent Count: 1
[root@esxi2:~] esxcfg-volume -M 569de90e-e375c969-095d-6cc21735c080
Persistently mounting volume 569de90e-e375c969-095d-6cc21735c080
Error: Unable to mount this VMFS volume due to the original volume has some extents online
[root@esxi2:~]
OK, I believe you'll need to resignature it, so use this command:
esxcli storage vmfs snapshot resignature -u "vmfs UUID"
run "esxcli storage vmfs snapshot list" again, it should return nothing and then run "esxcfg-volume -l" and you should see a new volume looking like SNAP-random number-original volume name
Thank you! It's worked!
/vmfs/volumes/64f8723b-413a2505-3133-2c768a51c03c snap-4659cc8a-FS1-iSCSI 64f8723b-413a2505-3133-2c768a51c03c true VMFS-5 10737149804544 27025997824
As far as I understand, all I have to do now is rename the datastore from snap-4659cc8a-FS1-iSCSI back to FS1-iSCSI and repeat the procedure on the remaining ESXi servers.
What about the VM that was registered on this ESXi server and located on this datastore? It is stuck in the vSphere inventory in an Inaccessible state.
And is there a way to delete the old device whose datastore shows as "Not Consumed"?
I've had issues renaming recovered volumes like this in the past and always found it easiest to create a new datastore, migrate the VMs/disks over, and then cleanly remove every reference to the old names. If the rename does work, then once it's done on one host you just need to rescan the other hosts.
Probably the easiest way is to re-mount the old volume and then delete what you don't want via the standard process:
identify all available volumes with "esxcfg-volume -l"
then, using the VMFS UUID identified for the "old" datastore, mount it with
esxcfg-volume -M "vmfs uuid"
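Once the rename is done on one host, something like this on each of the other hosts should pick the change up (or just use Rescan Storage in the UI):
esxcli storage core adapter rescan --all   # rescan all adapters on the host
vmkfstools -V                              # refresh VMFS volumes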
I don't think that's possible after resignaturing the snapshot; in particular, the list is empty when I run esxcfg-volume -l. I think a reboot would help here, but I can't test that yet. For now I removed from the inventory the virtual machine that was located on the disconnected iSCSI disk and stuck in an inaccessible state. One of the ESXi hosts in the cluster saw the snap-4659cc8a-FS1-iSCSI datastore immediately, as soon as I added the new target address and did a Refresh Adapter. Everything seems OK.
One problem is left, and it's a real pain in the ass: a VM whose disk was referencing the iSCSI array at the time of the sudden outage. The server handles live phone calls (it runs an Asterisk server), so shutting it down is not desirable 🙂 The affected disk is attached as the second disk (/dev/sdb) in the VM (4 disks in total) and is mounted as ext4 in the guest OS; the other disks are LVM physical volumes that were added to / as space ran out, and it looks like they were added to LVM by device name (i.e. /dev/sda, /dev/sdc, /dev/sdd) rather than by UUID (I see use_lvmetad=0). It is an old server running CentOS 6.5. I suspect that after a forced shutdown (to remove the iSCSI disk and re-add it) the disk order may change both in the VM and in the guest OS, and the VM will most likely stop booting because of LVM issues. For now the server is still running, but I can't run pvs or vgs (LVM); the commands hang. I'm afraid to reconfigure a production system to reference disks by UUID, since I've never done that before. I need to somehow reproduce the problem on a test machine and experiment.
The second option is to swap the disk on the running system. Is it possible to point the VM at the resignatured iSCSI datastore without rebooting? The disk itself is unchanged; it just lives on a datastore with a new signature now. I can't even unmount the disk in the guest OS yet; it is busy. I tried umount -f /data, fuser -km /data and even kill -9 on the PID, but there is an updatedb process I can't kill: it is using a partition of this hung disk, and the process is in the DN state (uninterruptible sleep with low priority).
Anyway, I'm thinking about what to do with that VM.
Sorry, I'm struggling with this one as well, but my gut instinct would be to take a snapshot and then restart the VM.
Hi,
A note on how I solved the datastore renaming problem. After the resignature procedure, the iSCSI datastore named FS1-iSCSI became snap-4659cc8a-FS1-iSCSI. It cannot be renamed as long as the old datastore is still referenced by VMs. In my case those were a VM with an iSCSI-attached disk that was hosted on that datastore, and a Veeambackup VM that was hosted on the datastore itself.
First, I shut down the VM with the iSCSI-attached disk and removed the disk from the virtual machine (without deleting the disk itself), then added it back from the snap-4659cc8a-FS1-iSCSI datastore. Thankfully, no LVM problems occurred inside the VM. After that, I removed the Veeambackup host from the inventory, browsed the snap-4659cc8a-FS1-iSCSI datastore, selected the Veeambackup VM, and right-clicked its .vmx file to register the machine back into the inventory. Once no other objects referenced the old datastore, it disappeared from the storage list, which made it possible to rename snap-4659cc8a-FS1-iSCSI back to FS1-iSCSI: right-click on the datastore and rename it.
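For anyone who prefers the ESXi shell to the UI for the unregister/re-register step, vim-cmd can do the same thing; the folder and .vmx names below are only an example of how the Veeambackup VM might be laid out on the datastore:
vim-cmd vmsvc/getallvms                    # note the Vmid of the VM to remove
vim-cmd vmsvc/unregister <vmid>            # remove it from the inventory
vim-cmd solo/registervm /vmfs/volumes/snap-4659cc8a-FS1-iSCSI/Veeambackup/Veeambackup.vmx   # register it back from the new datastore (example path)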
Roughly the same issue came up with an NFS mount point (the volume number changed there, so I had to remount it on every ESXi host in the cluster). The old NAS had a Public share exported over NFS, from which we took ISO images when rolling out VMs. Some VMs still referenced those ISOs in their DVD drives, and we had to remove them and then change the mount point on each ESXi host:
esxcli storage nfs list
esxcli storage nfs remove -v Public
esxcli storage nfs add -H 192.168.X.X -s /volume2/Public -v Public
It is also important to remember that virtual machines with multiple snapshots need to have them consolidated before this procedure.
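If you prefer to consolidate from the ESXi shell rather than the UI, something along these lines should do it (removeall commits and deletes every snapshot of the given VM):
vim-cmd vmsvc/getallvms                    # find the Vmid of the VM
vim-cmd vmsvc/snapshot.removeall <vmid>    # remove/consolidate all of its snapshots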
The following articles helped:
Troubleshooting LUNs detected as snapshot LUNs in vSphere (1011387)
https://kb.vmware.com/s/article/1011387
How to register or add a Virtual Machine (VM) to the vSphere Inventory in vCenter Server (1006160)
https://kb.vmware.com/s/article/1006160
Consolidating/Committing snapshots in VMware ESXi (1002310)
https://kb.vmware.com/s/article/1002310
Remounting a disconnected NFS datastore from the ESXi command line (1005057)
https://kb.vmware.com/s/article/1005057
Also a big thanks to battybishop!
Excellent news that it's now all resolved, so glad I could help with part of the resolution 😁