2 Replies Latest reply on Jun 22, 2020 10:53 PM by DeMichel93

    SRM 8.3 and vSphere HA - Recovery is stuck at "Delete file"

    DeMichel93 Lurker

      Hello,

      I'm using SRM 8.3 with stretched storage from PureStorage (ActiveCluster) and vSphere HA enabled on both sides, but I've run into a problem. When I try to recover VMs via SRM with vMotion recovery enabled, the recovery plan gets stuck at step "19.1.2.1 Recover storage consistency group".

      At this point the recovery plan/SRM tries to remove files from the protected datastore. To be exact, it tries to remove the "vSphere-HA/FDM-stringoflettersandnumber-vcxx" folder, but it is unable to because vSphere HA is enabled on both sides. When I disable vSphere HA, SRM proceeds without a hassle and simply continues recovering the VMs. I tried selecting datastores that are not protected by SRM for heartbeating, but vSphere HA still creates those folders on the protected datastores and SRM is still unable to proceed with the recovery. I also tried disabling VM Monitoring in the vSphere HA settings, but to no avail. SRM just loops, trying to delete the files/folder. Does anybody have any tips for this behaviour?

       

      vCenter 6.5, SRM 8.3.02, PureStorage SRA 3.1
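
For anyone comparing notes on which datastores carry these HA folders, here is a minimal Python sketch. It assumes you already have a plain list of datastore-relative paths (for example, copied from the datastore browser); the function name `ha_folders` and the path pattern are illustrative assumptions based on the folder name quoted above, not anything SRM or vSphere provides.

```python
import re

# Assumption: HA (FDM) working folders sit at the datastore root and look like
# "vSphere-HA/FDM-<id>", optionally with a leading dot, as quoted in the post.
HA_DIR = re.compile(r"^\.?vSphere-HA/FDM-[^/]+(/|$)")


def ha_folders(paths):
    """Return the distinct HA folder prefixes found in datastore-relative paths."""
    found = []
    for path in paths:
        match = HA_DIR.match(path.lstrip("/"))
        if match:
            folder = match.group(0).rstrip("/")
            if folder not in found:
                found.append(folder)
    return found
```

Running it over the listing of each protected datastore would show whether HA has created a folder there regardless of the heartbeat datastore selection.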

        • 1. Re: SRM 8.3 and vSphere HA - Recovery is stuck at "Delete file"
          ashilkrishnan Enthusiast
          VMware Employees

          Hi,

           

          1. Do you get any errors when the plan eventually fails/times out?  Please share a screenshot, if possible.

          2. Any custom HA settings for affected VM ?

          3. How many VMs are reporting this symptom ?

          4. Do you face similar issues when trying to manually do a cross vCenter vMotion of a non-protected VM ?

          • 2. Re: SRM 8.3 and vSphere HA - Recovery is stuck at "Delete file"
            DeMichel93 Lurker

            Hello, thank you for answering.

             

            1. Do you get any errors when the plan eventually fails/times out?  Please share a screenshot, if possible.

             

            I waited about 20 minutes and the plan neither failed nor timed out; it just kept repeating tasks that try to delete files.

            I disabled HA on the protected site, from which the VM is "recovered", and that helped: the Delete file task completed and the recovery plan changed to Recovery Complete status. Disabling HA on the recovery site has no effect at this point.

            BTW, by the time SRM tries to delete the files, the "recovered" machine has already been vMotioned to the recovery site and is running.

            2. Any custom HA settings for affected VM ?

            No, HA is pretty much just enabled on the cluster; no custom settings are enabled on any specific VM. I tried changing the heartbeat datastores to ones that are not protected by SRM, but this did nothing.

             

            3. How many VMs are reporting this symptom ?

             

            I've tested three separate protection groups (separate datastores) with their own recovery plans, and all of them have the same symptom: they are all stuck at step 19.1.2.1, when SRM is trying to delete a file from the datastore. It tries to delete the files repeatedly; when one delete task fails it immediately tries again, and it goes on and on.
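
To put a number on the retry loop, a rough Python sketch like the one below could be run over an exported task list. It assumes the tasks are available as `(task_name, state)` pairs in chronological order, with states written as `"error"` and `"success"`; that input format and the function name `longest_failed_run` are assumptions for illustration, not an SRM or vCenter API.

```python
def longest_failed_run(tasks, name="Delete file"):
    """Return the length of the longest unbroken run of failed tasks called `name`.

    `tasks` is an iterable of (task_name, state) pairs in chronological order.
    Assumption: a failed task has state "error".
    """
    longest = 0
    current = 0
    for task_name, state in tasks:
        if task_name == name and state == "error":
            current += 1
            longest = max(longest, current)
        else:
            # Any other task, or a successful delete, breaks the run.
            current = 0
    return longest
```

A large value here, with no successful delete in between, would match the behaviour described above, where each failed delete is immediately followed by another attempt.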

             

            4. Do you face similar issues when trying to manually do a cross vCenter vMotion of a non-protected VM ?

             

            Cross vCenter vMotion works without an issue. When I initiate a migration that changes the compute resource only, the migration starts; halfway through, a new task/event, "Initiate vMotion receive operation", pops up and completes without any issues as well. The VM is properly migrated to the other site.