VMware Cloud Community
sarmaskumar
Contributor

SRM reprotect fails after recovery to the DR site

Dear All,

I have a VMware SRM lab setup with SRM 5.1 and NetApp Simulator 8.1.2.

The setup was implemented successfully and is working fine.

I tested a recovery to the DR site, and the recovery completed successfully.

After that, when I try to reprotect the site, the reprotect fails with the following errors.

1.1.1. Configure Array-based Storage
   Result: Warning - Operation timed out: 300 seconds
   Started: 2013-01-14 17:01:42 (UTC 0)
   Completed: 2013-01-14 17:07:27 (UTC 0)

1.1.2. Configure VR Replication
   Result: Success
   Started: 2013-01-14 17:01:41 (UTC 0)
   Completed: 2013-01-14 17:01:41 (UTC 0)

1.1.2.1. WEB1
   Result: Success
   Started: 2013-01-14 17:01:41 (UTC 0)
   Completed: 2013-01-14 17:01:41 (UTC 0)

2. Configure Protection to Reverse Direction
   Result: Error - The operation was only partially completed for the protection group 'PR-GROUP-WEB1' since a protected VM belonging to it was not successful in completing the operation. Cannot protect virtual machine 'WEB1' because its config file '[snap-1fabfe7d-PR-SAN] WEB1/WEB1.vmx' is located on a non-replicated or non-protected datastore.
   Started: 2013-01-14 17:07:28 (UTC 0)
   Completed: 2013-01-14 17:07:30 (UTC 0)

2.1. Protection Group PR-GROUP-WEB1
   Result: Error - The operation was only partially completed for the protection group 'PR-GROUP-WEB1' since a protected VM belonging to it was not successful in completing the operation. Cannot protect virtual machine 'WEB1' because its config file '[snap-1fabfe7d-PR-SAN] WEB1/WEB1.vmx' is located on a non-replicated or non-protected datastore.
   Started: 2013-01-14 17:07:28 (UTC 0)
   Completed: 2013-01-14 17:07:30 (UTC 0)

2.1.1. Configure Protection
   Result: Error - The operation was only partially completed for the protection group 'PR-GROUP-WEB1' since a protected VM belonging to it was not successful in completing the operation.
   Started: 2013-01-14 17:07:28 (UTC 0)
   Completed: 2013-01-14 17:07:30 (UTC 0)

2.1.2. Configure VMs Protection
   Result: Error - Cannot protect virtual machine 'WEB1' because its config file '[snap-1fabfe7d-PR-SAN] WEB1/WEB1.vmx' is located on a non-replicated or non-protected datastore.
   Started: 2013-01-14 17:07:29 (UTC 0)
   Completed: 2013-01-14 17:07:29 (UTC 0)

2.1.2.1. WEB1
   Result: Error - Cannot protect virtual machine 'WEB1' because its config file '[snap-1fabfe7d-PR-SAN] WEB1/WEB1.vmx' is located on a non-replicated or non-protected datastore.
   Started: 2013-01-14 17:07:29 (UTC 0)
   Completed: 2013-01-14 17:07:29 (UTC 0)
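
A quick check of the timestamps on step 1.1.1 shows how far the array step ran past the 300-second limit. This is only a minimal Python sketch: the function name is illustrative, and the inputs are the two timestamps and the timeout value quoted in the report.

```python
from datetime import datetime

# Timestamps copied from step 1.1.1 (Configure Array-based Storage).
STARTED = "2013-01-14 17:01:42"
COMPLETED = "2013-01-14 17:07:27"
TIMEOUT_SECONDS = 300  # value quoted in the warning text


def elapsed_seconds(started: str, completed: str) -> int:
    """Return the whole seconds between two 'YYYY-MM-DD HH:MM:SS' timestamps."""
    fmt = "%Y-%m-%d %H:%M:%S"
    delta = datetime.strptime(completed, fmt) - datetime.strptime(started, fmt)
    return int(delta.total_seconds())


elapsed = elapsed_seconds(STARTED, COMPLETED)
overrun = elapsed - TIMEOUT_SECONDS
print(f"array step ran {elapsed}s, {overrun}s past the {TIMEOUT_SECONDS}s timeout")
```

The array step ran for 345 seconds, and step 2 started at 17:07:28, one second after it ended. That would be consistent with the protection step starting before the array had finished reversing replication.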

Is there any option to increase the timeout for array-based storage replication?

However, if I retry the reprotect after some time, it completes successfully.

Please find attached the reports for the successful and failed reprotect tasks.

12 Replies
memaad
Virtuoso

Hi,

As you pointed out, I think it is just a matter of the time it takes to re-sync storage after failover. I think this KB article might help you:

http://kb.vmware.com/kb/1032752

Regards

Mohammed

ateaJSD
Contributor

Hi

I am not sure that the problem is a timeout.

We tested SRM 5.0.1 with NetApp SRA 2.0.1, and recovery worked fine on 6 NFS datastores and approx. 60 raw LUNs, but we never got reprotect to work. We have had support cases open with VMware and NetApp, and both agree that it is a problem with the NetApp SRA. There is no solution yet.

Regards

André

Igor_The_Great
Enthusiast

The error seems clear: "located on a non-replicated or non-protected datastore." Do you have DRS enabled?

You can also increase the timeout value to see if it helps.

sarmaskumar
Contributor

We don't have DRS in the infrastructure. Can anyone pinpoint which timeout value should be changed in the SRM advanced settings?

stuartclements
Community Manager

Igor_The_Great
Enthusiast

How did the VM end up on the unprotected datastore?  Is the replication mapping configured both ways?

sarmaskumar
Contributor

Dear Igor,

Replication is mapped both ways. The issue happens when we try to reprotect the site after recovery, but if we rerun the reprotect after 2 or 3 minutes, it finishes successfully.

Igor_The_Great
Enthusiast

Please be patient, I'm just trying to figure out your environment configuration... :)

Before the reprotect, does the datastore in question that holds the VM have a weird name that starts with snap-xxxxxx?

If the answer is yes, there is an advanced setting that forces removal of the snap-xxxx prefix upon successful completion of recovery.

You can set it by selecting the storageProvider.fixRecoveredDatastoreNames check box.
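
To make the renaming concrete, here is a small Python sketch of what stripping the prefix looks like for the datastore name in the error report. It is not SRM's implementation. The pattern (the literal "snap-", eight hex digits, then a hyphen) is an assumption based on the single example in this thread, and both function names are made up for illustration.

```python
import re

# Assumed pattern, inferred only from 'snap-1fabfe7d-PR-SAN' in the report.
SNAP_PREFIX = re.compile(r"^snap-[0-9a-f]{8}-", re.IGNORECASE)


def has_snap_prefix(name: str) -> bool:
    """True if the datastore name starts with the assumed snap-xxxxxxxx- prefix."""
    return SNAP_PREFIX.match(name) is not None


def original_name(name: str) -> str:
    """Return the name with the assumed prefix removed; other names pass through."""
    return SNAP_PREFIX.sub("", name, count=1)


recovered = "snap-1fabfe7d-PR-SAN"
print(has_snap_prefix(recovered), original_name(recovered))
```

A check like this can tell you whether the recovered datastore kept the snapshot-style name before you start the reprotect.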

fboulianne
Enthusiast

I got the same problem as the original poster.

The solution I found was to manually refresh the array manager devices while the reprotect is running.

It seems to refresh too soon and misses the replication, because it lists the volume as broken on the source and in the wrong direction on the target.
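
The "refreshes too soon" behaviour is the classic case for polling until a condition holds instead of checking once. Below is a generic Python sketch of that pattern. It does not call any SRM or SRA API: `wait_until` and `replication_reversed` are hypothetical names, and the state strings are stand-ins taken from the description above.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout: float, interval: float) -> bool:
    """Poll check() until it returns True or timeout seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for a real device refresh: the simulated array reports the
# reversed direction only on the third poll.
states = iter(["broken", "wrong-direction", "reversed"])


def replication_reversed() -> bool:
    return next(states) == "reversed"


ready = wait_until(replication_reversed, timeout=600, interval=0)
print("safe to configure protection:", ready)
```

In a real environment the check would be whatever refresh or query your array tooling provides, and the interval would be seconds rather than zero.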

iforbes
Hot Shot

Hi fboulianne. I'm seeing the exact same thing. I had to manually refresh the array managers, which is not ideal. Did you change a timeout to get around this refresh workaround?

Thanks

Ian

admin
Immortal

Sounds like this was a specific bug: Performing a reprotect operation in vCenter Site Recovery Manager fails with NetApp FAS/V-Series (SR...

What version of SRA do you have?

As far as I know, there isn't any particular parameter that would delay the discover devices operation to avoid this problem.

iforbes
Hot Shot

Hi p_hall. That bug definitely looks to be the culprit. We're running the 3.0 SRA for cmode, though. I doubt it's been resolved.
