VMware Cloud Community
tobiashansen
Contributor

Local SRM plug-in loses connection to remote SRM server during the "Reset storage post test" step

setup:

2 x hp r710

2 x EqualLogic disk systems.

vSphere 4.0 + SRM 4

Equallogic SRA

I have tested several vCenter installations with SRM, and the same issue occurs in each setup:

The setup works as expected and the test failover runs as expected.

Problem:

During the rollback of a test, the last step is "Reset storage post test".

During this step VMware issues commands to remove the promoted replica volume on the EqualLogic array and to rescan the VMware storage adapters.

The progress bar gets to 60 % of task completion in the gui.

After almost 120 seconds of run time the two SRM sites lose connection to each other. The error "Local srm plug-in lost connection to remote site" pops up.

In the background the Equallogic GUI shows that the volume is removed after approximately 5 minutes.

The VMware GUI shows that the "Rescan all HBA" task can take a few minutes longer.

If I try to reconnect to SRM before the step completes, I can see that the progress bar has moved to 75 %, but the connection most likely times out again.

After a test has timed out and the volume has been correctly dismounted, I can reconnect to SRM, and everything appears to be working.

However, you cannot add new groups or run other tests until the SRM service has been restarted on the SRM Windows servers at both sites.

All attempts to do so fail with the error: "Access to perform the operation was denied."

A test that fails with this error runs for only 1 second before stopping.

- Once the services have been restarted, everything works again up to the point where a test completes and the post-test step tries to reset storage again.

Which SRM timeout settings can be related to the Reset storage step, and where do I modify them?

Which VMware timeout settings can be related to the Reset storage step, and where do I modify them?

Side note:

As I understand the Reset storage step, all that has to be done at this point is to demote the promoted volume. In EqualLogic this takes just a few seconds. After that, the rescan should start.

However, when I look at the progress bar in VMware, the rescan appears to start before the volume is gone. Can this be changed so that the rescan depends on completion of the dismount in EqualLogic?
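
To illustrate the ordering I have in mind, here is a minimal Python sketch. It is not SRM or SRA code, and I do not know whether SRM exposes anything like it: `demote_volume`, `volume_is_gone` and `rescan_hbas` are hypothetical placeholders for whatever the SRA and vCenter actually do, and the timeout values are arbitrary.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout_s: float,
               poll_s: float = 1.0,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `condition` until it returns True or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_s)


def reset_storage(demote_volume: Callable[[], None],
                  volume_is_gone: Callable[[], bool],
                  rescan_hbas: Callable[[], None],
                  timeout_s: float = 600.0,
                  poll_s: float = 5.0,
                  sleep: Callable[[float], None] = time.sleep) -> None:
    """Hypothetical ordering: demote, wait for the volume to vanish, then rescan."""
    demote_volume()
    # Only start the rescan once the array reports the volume as removed.
    if not wait_until(volume_is_gone, timeout_s, poll_s, sleep):
        raise TimeoutError(f"volume still present after {timeout_s:.0f} s")
    rescan_hbas()
```

With this ordering the rescan would never overlap the volume removal, which is the dependency I am asking about.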

Thanks for any help and suggestions.
