
Issues with the esxi datastore after Synology LUN migration

  • 1.  Issues with the esxi datastore after Synology LUN migration

    Posted Sep 05, 2023 02:25 PM

    Hi!

    Three ESXi 6.0.0 servers had iSCSI configured against a Synology NAS that went down. We bought a new NAS, fixed the old one, and migrated the LUN from the old unit to the new one. When I tried to reconnect one of the ESXi hosts, I found that its iSCSI Software Adapter still contained the old connection details. I deleted the previous entry under Static Discovery and created a new one via Dynamic Discovery. Under Devices and Paths I could not delete the stale entry; it is frozen:

    dsd7150_0-1693922352016.png
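
    (For reference, the CLI equivalent of those discovery changes should be roughly the following; the adapter name vmhba33, the target addresses and the IQN are only placeholders for my setup:)

    esxcli iscsi adapter list
    esxcli iscsi adapter discovery statictarget remove --adapter=vmhba33 --address=192.168.140.50:3260 --name=iqn.2000-01.com.synology:oldnas.target-1
    esxcli iscsi adapter discovery sendtarget add --adapter=vmhba33 --address=192.168.140.51:3260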

    I then went ahead and connected the ESXi host to the restored LUN. I added it via Discovery, clicked Rescan Adapter, then Rescan Storage. On the NAS I could see the iSCSI connection come up. The problem is that the previously configured array did not reappear in the Datastores section of the ESXi host after this procedure. It's gone. Why is that? It seems that ESXi treats it as a new device. How do I get back the datastore named FS1-iSCSI that was previously attached to the old LUN?

    dsd7150_1-1693922635210.png
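
    (The rescans can also be run from the shell; this is just a sketch, with vmhba33 again standing in for the software iSCSI adapter:)

    esxcli storage core adapter rescan --adapter=vmhba33
    esxcli storage filesystem list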

    It is still configured, but in an Inaccessible state, on the other two ESXi hosts.

    dsd7150_2-1693922651532.png

    Screenshot from the SAN Manager of the Synology NAS (LUN-1 was backed up with the Hyper Backup package on the Synology and restored with Hyper Backup Vault):

    dsd7150_3-1693922876897.png

    More screenshots of the 192.168.140.102 ESXi host:

    Devices:

    dsd7150_5-1693923186923.png

    Path:

    dsd7150_6-1693923201469.png

    Dynamic Discovery:

    dsd7150_7-1693923246957.png

    Static Discovery:

    dsd7150_8-1693923279805.png

    Any help is appreciated.



  • 2.  RE: Issues with the esxi datastore after Synology LUN migration

    Posted Sep 05, 2023 03:30 PM

    Check to make sure it's not being recognised as a snapshot.

    SSH onto a host and use this command to find out whether it is being recognised as a snapshot:

     "esxcli storage vmfs snapshot list"

    This will return the volume name and a VMFS UUID.

    Then, using that UUID, you can mount it with this command:

    esxcfg-volume -M "vmfs uuid"
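
    (If you'd rather stay in esxcli, the equivalent mount should be roughly:)

    esxcli storage vmfs snapshot mount -u "vmfs uuid"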



  • 3.  RE: Issues with the esxi datastore after Synology LUN migration

    Posted Sep 06, 2023 05:27 AM

    [root@esxi2:~] esxcli storage vmfs snapshot list
    569de90e-e375c969-095d-6cc21735c080
       Volume Name: FS1-iSCSI
       VMFS UUID: 569de90e-e375c969-095d-6cc21735c080
       Can mount: false
       Reason for un-mountability: the original volume has some extents online
       Can resignature: true
       Reason for non-resignaturability:
       Unresolved Extent Count: 1
    [root@esxi2:~] esxcfg-volume -M 569de90e-e375c969-095d-6cc21735c080
    Persistently mounting volume 569de90e-e375c969-095d-6cc21735c080
    Error: Unable to mount this VMFS volume due to the original volume has some extents online
    [root@esxi2:~]

     


  • 4.  RE: Issues with the esxi datastore after Synology LUN migration
    Best Answer

    Posted Sep 06, 2023 10:29 AM

    OK, I believe you'll need to resignature it, so use this command:

    esxcli storage vmfs snapshot resignature -u "vmfs UUID"

    Run "esxcli storage vmfs snapshot list" again; it should return nothing. Then run "esxcfg-volume -l" and you should see a new volume named something like SNAP-<random number>-<original volume name>.



  • 5.  RE: Issues with the esxi datastore after Synology LUN migration

    Posted Sep 06, 2023 12:48 PM

    Thank you! It worked!

    /vmfs/volumes/64f8723b-413a2505-3133-2c768a51c03c snap-4659cc8a-FS1-iSCSI 64f8723b-413a2505-3133-2c768a51c03c true VMFS-5 10737149804544 27025997824

    As far as I understand, all I have to do now is rename the datastore from snap-4659cc8a-FS1-iSCSI back to FS1-iSCSI and do the same on the remaining ESXi servers.

    What about a machine that was registered on this ESXi server and located on this datastore? It is stuck in the vSphere inventory in an inaccessible state.

     

     



  • 6.  RE: Issues with the esxi datastore after Synology LUN migration

    Posted Sep 06, 2023 12:52 PM

    And is there a way to delete the old disk whose datastore shows as "Not Consumed"?

    dsd7150_0-1694004652646.png

     



  • 7.  RE: Issues with the esxi datastore after Synology LUN migration

    Posted Sep 06, 2023 01:44 PM

    Probably easiest to re-mount the old volume and then delete what you don't want via the standard process.

    Identify all available volumes with "esxcfg-volume -l".

    Then, using the VMFS UUID identified for the "old" datastore, mount it with:

    esxcfg-volume -M "vmfs uuid"
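
    (If the goal is instead just to get rid of the stale "Not Consumed" device entry, detaching the device is the usual route; the naa identifier below is only a placeholder for the one shown in the Devices list:)

    esxcli storage core device set --state=off -d naa.xxxxxxxxxxxxxxxx
    esxcli storage core device detached list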



  • 8.  RE: Issues with the esxi datastore after Synology LUN migration

    Posted Sep 07, 2023 06:51 AM

    I don't think that's possible after resignaturing the snapshot, especially since the list is empty when I run esxcfg-volume -l. I think a reboot would help here, but I can't test it yet. For now I removed from Inventory the virtual machine that was located on the disconnected iSCSI disk and was in an inaccessible state. One of the ESXi hosts in the cluster saw the snap-4659cc8a-FS1-iSCSI datastore immediately, as soon as I added the new address and did a Refresh Adapter. Everything seems OK.
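
    (For the record, removing a VM from Inventory can also be done from the host shell; the vmid is whatever vim-cmd vmsvc/getallvms reports for that VM:)

    vim-cmd vmsvc/getallvms
    vim-cmd vmsvc/unregister <vmid>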

    One problem is left, and it's a real pain: a VM whose disk was referencing the iSCSI array at the time of the sudden outage. The server handles live phone calls (it runs an Asterisk server), so shutting it down is undesirable. The disk is attached as the second disk (/dev/sdb) in the VM (four disks in total) and is mounted as ext4 in the OS; the others are in LVM volumes that were added to / as space ran out, and it looks like the disks were added to LVM by name (i.e. /dev/sda, /dev/sdc, /dev/sdd) rather than by UUID (I see use_lvmetad=0). This is an old server running CentOS 6.5. After a forced shutdown (to remove the iSCSI disk and re-add it), the disk order may change both in the VM and in the OS, and the VM will most likely stop booting due to LVM issues. For now the server is still running, but I can't run pvs or vgs (LVM); the system hangs. I'm afraid to reconfigure a production system to use UUID-based disks, as I've never done it before. I need to reproduce the problem on a test machine and experiment first.
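
    (For the ext4 /data disk specifically, switching its fstab entry from the device name to a UUID would look roughly like this inside the guest; the UUID shown is made up:)

    blkid /dev/sdb1
    # then in /etc/fstab, replace the /dev/sdb1 line with something like:
    # UUID=0a1b2c3d-1111-2222-3333-444455556666  /data  ext4  defaults  0  2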

    dsd7150_0-1694068237241.png

    The second option is to swap the disk on the running system. Is it possible to make the VM refer to the resignatured iSCSI datastore without rebooting? The disk itself is unchanged on the storage; it just sits on a differently named datastore now because of the resignature. I can't even unmount the disk in the guest OS yet, it's busy. I tried umount -f /data, fuser -km /data and even kill -9 on the PID: there is an updatedb process that I can't kill, it is using a partition of this hung disk, and the process itself is in the DN state (uninterruptible sleep with low priority).
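
    (A lazy unmount at least detaches the mount point from the filesystem tree while it is still busy, although it won't release a device held by a process stuck in D state, so it may not help much here:)

    umount -l /data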

    Anyway, I'm thinking about what to do with that VM.



  • 9.  RE: Issues with the esxi datastore after Synology LUN migration

    Posted Sep 07, 2023 11:48 AM

    Sorry, I'm struggling with this one as well, but my gut instinct would be to take a snapshot and then restart the VM.



  • 10.  RE: Issues with the esxi datastore after Synology LUN migration

    Posted Sep 19, 2023 06:59 AM

    Hi,

    A note on how I solved the datastore renaming problem. After the resignature procedure, the iSCSI datastore named FS1-iSCSI became snap-4659cc8a-FS1-iSCSI. It cannot be renamed as long as VMs still reference the old datastore. In my case these were a VM with an iSCSI-attached disk hosted on that datastore, and a Veeambackup VM hosted on the datastore itself. First, I shut down the VM with the attached disk and removed the disk from the virtual machine (without deleting the disk itself), then re-added it from the snap-4659cc8a-FS1-iSCSI datastore. Thankfully, no LVM problems occurred in the VM. After that, I removed the Veeambackup host from Inventory, browsed the snap-4659cc8a-FS1-iSCSI datastore, selected the Veeambackup VM and right-clicked its .vmx to register the machine back into Inventory. Once nothing referenced the old datastore any more, it disappeared from the storage list, which allowed snap-4659cc8a-FS1-iSCSI to be renamed back to FS1-iSCSI: right-click the datastore and rename it.
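
    (Registering the VM back can also be done from the host shell with vim-cmd; the path below is only an example of where the Veeambackup .vmx might live:)

    vim-cmd solo/registervm /vmfs/volumes/snap-4659cc8a-FS1-iSCSI/Veeambackup/Veeambackup.vmx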

    Roughly the same issue came up with an NFS mount point (the volume number changed there, so I had to remount it on every ESXi host in the cluster). On the old NAS, a share named Public was exported over NFS; we took ISO images from it and rolled out VMs. Some VMs still referenced those ISOs in their DVD drives, which we had to remove, and then the mount point had to be changed on each ESXi host:

    esxcli storage nfs list
    esxcli storage nfs remove -v Public
    esxcli storage nfs add -H 192.168.X.X -s /volume2/Public -v Public

    It is also important to remember that virtual machines holding multiple snapshots need to have them consolidated before this procedure.
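
    (Consolidation can be kicked off from the shell as well; note that vim-cmd vmsvc/snapshot.removeall commits and removes all snapshots for the given VM, the vmid coming from vim-cmd vmsvc/getallvms:)

    vim-cmd vmsvc/getallvms
    vim-cmd vmsvc/snapshot.removeall <vmid>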

    The following articles helped:

    Troubleshooting LUNs detected as snapshot LUNs in vSphere (1011387)
    https://kb.vmware.com/s/article/1011387

    How to register or add a Virtual Machine (VM) to the vSphere Inventory in vCenter Server (1006160)
    https://kb.vmware.com/s/article/1006160

    Consolidating/Committing snapshots in VMware ESXi (1002310)
    https://kb.vmware.com/s/article/1002310

    Remounting a disconnected NFS datastore from the ESXi command line (1005057)
    https://kb.vmware.com/s/article/1005057

    Also a big thanks to battybishop!



  • 11.  RE: Issues with the esxi datastore after Synology LUN migration

    Posted Sep 19, 2023 09:43 AM

    Excellent news that it's now all resolved. So glad I could help with part of this resolution.



  • 12.  RE: Issues with the esxi datastore after Synology LUN migration

    Posted Sep 06, 2023 01:43 PM

    I had issues renaming recovered volumes like this in the past and always found it easiest to create a new datastore, migrate the VMs/disks over, and then cleanly remove references to all the old names etc. If you are successful in renaming, then once it's done on one host you just need to re-scan the other hosts.