vbhost
Contributor

Failed to retrieve pairs from site recovery manager server at https://X.X.X.X:8095/dr

Hello All

I am using SRM 5.8.1 build 2991419. Recently, the SRM site pairing has appeared broken when we log in with our domain accounts. With Administrator@vsphere.local, I can browse the pair, configure network mappings, browse protection groups, etc. However, with our domain accounts, I get the error "Failed to connect to site recovery manager, Cause : Failed to retrieve pairs from site recovery manager server at https://X.X.X.X:8095/dr ". Our default SSO identity source is of type "Local OS", but we also have "Active Directory (Integrated Windows Authentication)" configured for the entire root domain. The vCenter Servers are in linked mode.

1. I have checked the communication from the VC server to the SRM server: both are listening, and connections are established on port 8095.

2. Telnet also works fine on that port. I uninstalled and re-installed SRM onto a new database and still see the same issue, which leads me to believe something is wrong on the AD authentication side. Our AD groups are added to the permissions; I have even added my account individually in the permissions section, with the same result.

3. Below are the SRM logs. I have already added the parameter <WaitForUpdatesTimeout>300</WaitForUpdatesTimeout> to the SRM config file and restarted the SRM services, since the logs were reporting a timeout of 900 seconds. I lowered it to 300 so that, if a firewall in between is timing out the connection, SRM's value is at least lower than the firewall's.

4. I have tried adding "Active Directory as an LDAP server" with the base DN configured. The connection test is successful, and I restarted the SRM service, but the issue remains.

5. Uninstalled and re-installed SRM onto a new database altogether; still the same issue. Administrator@vsphere.local works, but domain accounts do not.
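
For anyone who wants to repeat the port check from steps 1 and 2 without telnet, here is a minimal Python sketch. The function name `port_is_open` and the default timeout are my own choices, not anything from SRM. Note that a successful TCP connection only shows the port is reachable; it says nothing about whether authentication will succeed.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run it from the vCenter Server with the SRM server's address (X.X.X.X above) and port 8095, for example `port_is_open("X.X.X.X", 8095)` with the real address substituted.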

Can someone please help me figure out what is going wrong?

2017-10-01T02:51:33.864-04:00 [00268 info 'LocalVC' connID=vc-admin-6615 opID=410dea70] [PCM] Received NULL results from PropertyCollector::WaitForUpdatesEx due to timeout of 300 seconds

2017-10-01T02:51:49.193-04:00 [00444 verbose 'LocalSiteStatus'] Free disk space: 28575 Mb

2017-10-01T02:51:49.193-04:00 [00444 verbose 'LocalSiteStatus'] CPU usage: 0 %

2017-10-01T02:51:49.193-04:00 [00444 verbose 'LocalSiteStatus'] Available memory: 2505 Mb

2017-10-01T02:52:33.068-04:00 [00444 info 'RemoteDR' connID=dr-admin-615b opID=2a28125c] [PCM] Received NULL results from PropertyCollector::WaitForUpdatesEx due to timeout of 300 seconds

2017-10-01T02:52:34.130-04:00 [00444 info 'RemoteVC' connID=vc-admin-af58 opID=12f9ba61] [PCM] Received NULL results from PropertyCollector::WaitForUpdatesEx due to timeout of 300 seconds

2017-10-01T02:52:49.208-04:00 [00612 verbose 'LocalSiteStatus'] Free disk space: 28574 Mb

2017-10-01T02:52:49.208-04:00 [00612 verbose 'LocalSiteStatus'] CPU usage: 1 %

2017-10-01T02:52:49.208-04:00 [00612 verbose 'LocalSiteStatus'] Available memory: 2504 Mb

2017-10-01T02:52:59.646-04:00 [00268 error 'SoapAdapter.HTTPService'] Failed to read request; stream: <SSL(<io_obj p:0x000000000b87c338, h:-1, <TCP '0.0.0.0:0'>, <TCP '0.0.0.0:0'>>)>, error: class Vmacore::TimeoutException(Operation timed out)

2017-10-01T02:53:49.224-04:00 [00444 verbose 'LocalSiteStatus'] Free disk space: 28574 Mb

2017-10-01T02:53:49.224-04:00 [00444 verbose 'LocalSiteStatus'] CPU usage: 0 %

2017-10-01T02:53:49.224-04:00 [00444 verbose 'LocalSiteStatus'] Available memory: 2504 Mb

2017-10-01T02:54:49.240-04:00 [04976 verbose 'LocalSiteStatus'] Free disk space: 28574 Mb

2017-10-01T02:54:49.240-04:00 [04976 verbose 'LocalSiteStatus'] CPU usage: 1 %

2017-10-01T02:54:49.240-04:00 [04976 verbose 'LocalSiteStatus'] Available memory: 2504 Mb

2017-10-01T02:55:49.256-04:00 [04472 verbose 'LocalSiteStatus'] Free disk space: 28574 Mb

2017-10-01T02:55:49.256-04:00 [04472 verbose 'LocalSiteStatus'] CPU usage: 0 %

2017-10-01T02:55:49.256-04:00 [04472 verbose 'LocalSiteStatus'] Available memory: 2500 Mb

2017-10-01T02:56:25.412-04:00 [04472 verbose 'PropertyProvider'] RecordOp ASSIGN: content.about.name, DrServiceInstance

2017-10-01T02:56:26.349-04:00 [01296 verbose 'QsProvider' connID=7cfa] Created entry iterator [class Dr::QueryService::ObjectEntryIterator:000000000B54D920] from generation 1251. Max generation 1252

2017-10-01T02:56:26.349-04:00 [01296 verbose 'QsProvider' connID=7cfa] All data up to generation '1252' is sent to IS

2017-10-01T02:56:26.927-04:00 [03980 verbose 'Licensing'] Asset in sync.

2017-10-01T02:56:27.537-04:00 [04460 verbose 'ThreadPoolStats']

--> Vmacore thread pool stats:

--> /ThreadPool/Vmacore/IOThreads/total 0

--> /ThreadPool/Vmacore/IdleThreads/total 10

--> /ThreadPool/Vmacore/MaxIOThreads/total 401

--> /ThreadPool/Vmacore/MaxWorkerThreads/total 200

--> /ThreadPool/Vmacore/MinIOThreads/total 2

--> /ThreadPool/Vmacore/MinWorkerThreads/total 2

--> /ThreadPool/Vmacore/RunningThreads/total 10

--> /ThreadPool/Vmacore/WorkerThreads/total 0

-->

2017-10-01T02:56:33.865-04:00 [02780 info 'LocalVC' connID=vc-admin-6615 opID=410dea70] [PCM] Received NULL results from PropertyCollector::WaitForUpdatesEx due to timeout of 300 seconds

2017-10-01T02:56:49.256-04:00 [02780 verbose 'LocalSiteStatus'] Free disk space: 28574 Mb

2017-10-01T02:56:49.256-04:00 [02780 verbose 'LocalSiteStatus'] CPU usage: 1 %

2017-10-01T02:56:49.256-04:00 [02780 verbose 'LocalSiteStatus'] Available memory: 2500 Mb

2017-10-01T02:57:33.068-04:00 [03980 info 'RemoteDR' connID=dr-admin-615b opID=2a28125c] [PCM] Received NULL results from PropertyCollector::WaitForUpdatesEx due to timeout of 300 seconds

2017-10-01T02:57:34.146-04:00 [03980 info 'RemoteVC' connID=vc-admin-af58 opID=12f9ba61] [PCM] Received NULL results from PropertyCollector::WaitForUpdatesEx due to timeout of 300 seconds

2017-10-01T02:57:49.271-04:00 [00268 verbose 'LocalSiteStatus'] Free disk space: 28574 Mb

2017-10-01T02:57:49.271-04:00 [00268 verbose 'LocalSiteStatus'] CPU usage: 0 %

2017-10-01T02:57:49.271-04:00 [00268 verbose 'LocalSiteStatus'] Available memory: 2501 Mb

2017-10-01T02:58:49.271-04:00 [04472 verbose 'LocalSiteStatus'] Free disk space: 28574 Mb

2017-10-01T02:58:49.271-04:00 [04472 verbose 'LocalSiteStatus'] CPU usage: 1 %

2017-10-01T02:58:49.271-04:00 [04472 verbose 'LocalSiteStatus'] Available memory: 2499 Mb

2017-10-01T02:59:35.365-04:00 [00444 error 'SoapAdapter.HTTPService'] Failed to read request; stream: <SSL(<io_obj p:0x000000000b87c338, h:-1, <TCP '0.0.0.0:0'>, <TCP '0.0.0.0:0'>>)>, error: class Vmacore::TimeoutException(Operation timed out)

2017-10-01T02:59:49.272-04:00 [02780 verbose 'LocalSiteStatus'] Free disk space: 28574 Mb

2017-10-01T02:59:49.272-04:00 [02780 verbose 'LocalSiteStatus'] CPU usage: 0 %

2017-10-01T02:59:49.272-04:00 [02780 verbose 'LocalSiteStatus'] Available memory: 2501 Mb

2017-10-01T03:00:49.272-04:00 [02780 verbose 'LocalSiteStatus'] Free disk space: 28574 Mb

2017-10-01T03:00:49.272-04:00 [02780 verbose 'LocalSiteStatus'] CPU usage: 1 %

2017-10-01T03:00:49.272-04:00 [02780 verbose 'LocalSiteStatus'] Available memory: 2501 Mb

2017-10-01T03:01:25.428-04:00 [04472 verbose 'PropertyProvider'] RecordOp ASSIGN: content.about.name, DrServiceInstance

2017-10-01T03:01:25.506-04:00 [02140 verbose 'QsProvider' connID=7cfa] Created entry iterator [class Dr::QueryService::ObjectEntryIterator:000000000B54CF00] from generation 1252. Max generation 1253

2017-10-01T03:01:25.506-04:00 [02140 verbose 'QsProvider' connID=7cfa] All data up to generation '1253' is sent to IS

2017-10-01T03:01:26.928-04:00 [03980 verbose 'Licensing'] Asset in sync.

2017-10-01T03:01:27.537-04:00 [04460 verbose 'ThreadPoolStats']

vbhost
Contributor

Anyone there?

branav
Contributor

Hello vbhost,

Were these logs captured after you added <WaitForUpdatesTimeout> to the vmware-dr.xml file?

I ask because the logs still report a timeout of 300 seconds. Please make sure you have updated the correct vmware-dr.xml file.

2017-10-01T02:57:34.146-04:00 [03980 info 'RemoteVC' connID=vc-admin-af58 opID=12f9ba61] [PCM] Received NULL results from PropertyCollector::WaitForUpdatesEx due to timeout of 300 seconds
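
To confirm which timeout value each connection is actually using, a small script can pull it out of the log. This is only a sketch: the regular expression is based on the line format pasted in this thread, and the names `TIMEOUT_RE` and `timeout_values` are hypothetical helpers, not part of SRM.

```python
import re

# Matches lines such as:
# <timestamp> [<thread> <level> '<component>' ...] [PCM] ... WaitForUpdatesEx due to timeout of N seconds
TIMEOUT_RE = re.compile(
    r"^(?P<ts>\S+) \[(?P<thread>\d+) (?P<level>\w+) '(?P<component>[^']+)'[^\]]*\] "
    r".*WaitForUpdatesEx due to timeout of (?P<seconds>\d+) seconds"
)


def timeout_values(lines):
    """Yield (timestamp, component, seconds) for each WaitForUpdatesEx timeout line."""
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            yield match.group("ts"), match.group("component"), int(match.group("seconds"))
```

Pass it any iterable of log lines, for example an open SRM log file, and compare the reported seconds per component (LocalVC, RemoteVC, RemoteDR) with the value set in vmware-dr.xml.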

For reference: https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=20403...
