VMware Cloud Community
mclusky
Contributor
Contributor

SRM 6.1.1 site pairing error

Hi,

vCenter VCSA with external PSC (also appliance) 6.0.0 (build 4541948)

SRM 6.1.1 (build 4535903)

The same versions on both (protected and recovery) sites.

Recovery site uses VCSA with embedded PSC.

Additionally VMCA on PSC on protected site is configured as SubCA.

Site pairing task fails with status "Cannot complete login due to an incorrect user name or password" after 5 minutes from start. In vmware-dr... log I can see such messages:

...

2017-01-17T09:00:11.280+01:00 [03904 warning 'HmsProxy'] Dr::Providers::Hbr::HmsProxy::HmsServerMonitor::CheckRemotePair: Unable to find the remote site

2017-01-17T09:00:11.280+01:00 [03904 verbose 'HmsProxy'] Dr::Providers::Hbr::HmsProxy::HmsProxyImpl::SetRemoteInfoFailed: Unable to get remote HMS server info, error=

--> (dr.fault.RemoteSiteNotFound) {

-->    faultCause = (vmodl.MethodFault) null,

-->    msg = ""

--> }

...

2017-01-17T09:00:11.561+01:00 [04172 info 'LocalAuthzServer.PermissionSync' connID=sso-admin-7f6f] Received permission update for user 'VSPHERE.LOCAL\SRM-remote-dceaa31a-b2d2-4d84-a82d-1aa38db5d361' from SRM server.

2017-01-17T09:00:11.561+01:00 [03120 verbose 'AuthzStorageProvider'] [AuthzStorageProvider::WaitForUpdate] Got global permission update: current version: 4, new version 5

2017-01-17T09:00:11.561+01:00 [03904 verbose 'LocalVcServer-PCM' connID=sso-admin-7f6f] [PCM] Using token '4' for pending filter creation

2017-01-17T09:00:11.576+01:00 [04900 verbose 'LocalVcServer-PCM' connID=sso-admin-7f6f opID=15b10f9e] [PCM] Filter creation succeeded for filter token '4'

2017-01-17T09:00:11.576+01:00 [04900 verbose 'LocalVcServer-PCM' connID=sso-admin-7f6f opID=15b10f9e] [PCM] Recording new filter 'sessionId[52096]52793155-ce58-ccc2-a8d0-b554508aa053' for token '4' and connection version '1'

2017-01-17T09:00:11.576+01:00 [02900 info 'LocalAuthzServer.PermissionSync' connID=sso-admin-7f6f] Received permission update for user 'VSPHERE.LOCAL\SRM-remote-dceaa31a-b2d2-4d84-a82d-1aa38db5d361' from VC server.

2017-01-17T09:00:11.576+01:00 [04476 verbose 'LocalVcServer-PCM' connID=sso-admin-7f6f] [PCM] Destroying filter with filter token '4'

2017-01-17T09:00:11.576+01:00 [04476 verbose 'LocalVcServer-PCM' connID=sso-admin-7f6f] [PCM] Starting filter destruction of filter 'sessionId[52096]52793155-ce58-ccc2-a8d0-b554508aa053'

2017-01-17T09:00:11.576+01:00 [04476 info 'DrSiteUserManager' connID=sso-admin-7f6f] The task to assign default privileges for solution user 'SRM-remote-dceaa31a-b2d2-4d84-a82d-1aa38db5d361' for site 'dceaa31a-b2d2-4d84-a82d-1aa38db5d361' finished successfully.

2017-01-17T09:00:11.576+01:00 [03120 verbose 'AuthzStorageProvider'] [AuthzStorageProvider::GetGlobalPermissionUpdate] Get a new permission: DrServiceInstance:0:VSPHERE.LOCAL\SRM-remote-f4f96693-4630-482d-9252-0e41c5a2b5e3

2017-01-17T09:00:11.576+01:00 [02900 verbose 'LocalVcServer-PCM' connID=sso-admin-7f6f] [PCM] Destruction of filter 'sessionId[52096]52793155-ce58-ccc2-a8d0-b554508aa053' succeeded

2017-01-17T09:00:11.639+01:00 [04296 info 'Default'] W32Util_CreateOrReplaceAdminOnlyFile: deleting pre-existing file .\dceaa31a-b2d2-4d84-a82d-1aa38db5d361.p12.

[#1] -->

2017-01-17T09:00:11.639+01:00 [04296 info 'DrSiteUserManager'] Temporary certificate/key pair for solution user 'SRM-remote-0b265a6d-4e3e-43b0-aed9-35f6a9619816' at domain 'vsphere.local', LS++ site 'default-first-site', VC '0093BB51-B459-48E3-A010-36FB267D01B3' and SRM with id 'dceaa31a-b2d2-4d84-a82d-1aa38db5d361' promoted to permanent successfully.

2017-01-17T09:00:11.639+01:00 [03120 verbose 'UserDirectory'] [GetUserInfo] User not found: 'VSPHERE.LOCAL\SRM-remote-f4f96693-4630-482d-9252-0e41c5a2b5e3' isGroup=false

2017-01-17T09:00:11.639+01:00 [03120 warning 'AuthzStorageProvider'] [AuthzStorageProvider::SyncClientCache] ignore the exception for adding global permission for user class Vmacore::Authorize::AuthUserNotFoundException(User VSPHERE.LOCAL\SRM-remote-f4f96693-4630-482d-9252-0e41c5a2b5e3): VSPHERE.LOCAL\SRM-remote-f4f96693-4630-482d-9252-0e41c5a2b5e3

2017-01-17T09:00:11.639+01:00 [02884 verbose 'TaskQueue'] Created FIFO Task Queue.

2017-01-17T09:00:11.639+01:00 [02884 info 'DrTask'] Starting task 'dr.site.SiteManager.pairSrm1'

2017-01-17T09:00:11.639+01:00 [02884 verbose 'PropertyProvider'] RecordOp ASSIGN: info.progress, dr.site.SiteManager.pairSrm1. Applied change to temp map.

2017-01-17T09:00:11.639+01:00 [02736 verbose 'DrTask'] Starting VC task 'com.vmware.vcDr.dr.site.SiteManager.pairSrm'

2017-01-17T09:00:11.639+01:00 [02884 verbose 'DrSiteSiteManager'] Building a connection spec for the local services.

2017-01-17T09:00:11.654+01:00 [02884 info 'RemoteLkpServer' connID=c037] Starting persistent connection monitoring

2017-01-17T09:00:11.654+01:00 [02884 verbose 'RemoteLkpServer' connID=c037] The newly started persistent connection is still waiting to connect

2017-01-17T09:00:11.654+01:00 [02884 verbose 'RemoteLkpServer' connID=c037] Dr::PersistentConnection::ConnectLocked: Attempting to connect

2017-01-17T09:00:11.654+01:00 [04476 verbose 'StubFactory' connID=c037] Event broadcasted

2017-01-17T09:00:11.670+01:00 [04792 verbose 'DrTask'] Created VC task 'com.vmware.vcDr.dr.site.SiteManager.pairSrm:task-8054'

2017-01-17T09:00:11.701+01:00 [03088 verbose 'RemoteLkpServer' connID=c037] Connect succeeded, new connection id="lkp-anonymous"

2017-01-17T09:00:11.701+01:00 [02212 verbose 'StubFactory' connID=c037 ctxID=7cca22] Event broadcasted

2017-01-17T09:00:11.701+01:00 [05060 verbose 'StubFactory' connID=c037] Event broadcasted

2017-01-17T09:00:11.701+01:00 [02212 verbose 'ServerSingletonCache' connID=c037 ctxID=7cca22] Event broadcasted

2017-01-17T09:00:11.701+01:00 [04792 info 'RemoteSsoServer' connID=sso-admin-b822 ctxID=7cca22] Dr::SsoPersistentConnection::SsoPersistentConnection: Created an SSO persistent connection.

2017-01-17T09:00:11.701+01:00 [04792 info 'RemoteVcServer' connID=vc-admin-acb2 ctxID=7cca22] Created VC persistent connection for an admin session.

2017-01-17T09:00:11.701+01:00 [04792 verbose 'RemoteVcServer-PCM' connID=c037 ctxID=7cca22] [PCM] Initialized without an active connection

2017-01-17T09:00:11.701+01:00 [04792 verbose 'TaskQueue' connID=c037 ctxID=7cca22] Created FIFO Task Queue.

2017-01-17T09:00:11.701+01:00 [04792 verbose 'VmDomain' connID=c037 ctxID=7cca22] Created a VmDomainImpl instance.

2017-01-17T09:00:11.701+01:00 [04792 verbose 'TaskQueue' connID=c037 ctxID=7cca22] Created FIFO Task Queue.

2017-01-17T09:00:11.701+01:00 [04792 verbose 'HostDomain' connID=c037 ctxID=7cca22] Initialized Task Queue Service.

2017-01-17T09:00:11.701+01:00 [04792 verbose 'HostDomain' connID=c037 ctxID=7cca22] Initialized Task Queue Service.

2017-01-17T09:00:11.701+01:00 [04792 verbose 'HostDomain' connID=c037 ctxID=7cca22] Initialized Task Queue Service.

2017-01-17T09:00:11.701+01:00 [04792 verbose 'HostDomain' connID=c037 ctxID=7cca22] Created Host Domain.

2017-01-17T09:00:11.701+01:00 [04792 info 'RemoteAuthzServer' connID=authz-admin-43be ctxID=7cca22] Dr::AuthzPersistentConnection::AuthzPersistentConnection: Created an AuthZ persistent connection.

2017-01-17T09:00:11.701+01:00 [04792 info 'RemoteDrServer' connID=dr-admin-9301 ctxID=7cca22] Created DR persistent connection for solution user.

2017-01-17T09:00:11.701+01:00 [04792 verbose 'RemoteDrServer-PCM' connID=c037 ctxID=7cca22] [PCM] Initialized without an active connection

2017-01-17T09:00:11.701+01:00 [04792 verbose 'TaskQueue' connID=c037 ctxID=7cca22] Created FIFO Task Queue.

2017-01-17T09:00:11.701+01:00 [04792 verbose 'PropertyProvider' connID=c037 ctxID=7cca22] RecordOp ASSIGN: info.progress, dr.site.SiteManager.pairSrm1. Applied change to temp map.

2017-01-17T09:00:11.763+01:00 [02212 verbose 'PropertyProvider' connID=c037] RecordOp ASSIGN: info.progress, dr.site.SiteManager.pairSrm1. Applied change to temp map.

2017-01-17T09:00:11.779+01:00 [04748 verbose 'PropertyProvider' connID=c037] RecordOp ASSIGN: info.progress, dr.site.SiteManager.pairSrm1. Applied change to temp map.

2017-01-17T09:00:11.810+01:00 [02212 verbose 'SamlTokenFactory' connID=c037] Setting the connection reset functor.

2017-01-17T09:00:11.810+01:00 [02212 info 'RemoteSsoServer' connID=sso-admin-b822] Resetting the PersistentConnection

2017-01-17T09:00:11.810+01:00 [02212 verbose 'RemoteSsoServer' connID=sso-admin-b822] Installing new stub adapter factory into the PersistentConnection.

2017-01-17T09:00:11.810+01:00 [02212 verbose 'RemoteSsoServer' connID=sso-admin-b822] Invoking StartInternalLocked

2017-01-17T09:00:11.810+01:00 [02212 verbose 'RemoteSsoServer' connID=sso-admin-b822] Starting connection monitoring.

2017-01-17T09:00:11.810+01:00 [02212 verbose 'RemoteSsoServer' connID=sso-admin-b822] Dr::PersistentConnection::ConnectLocked: Attempting to connect

2017-01-17T09:00:11.810+01:00 [02212 info 'RemoteVcServer' connID=vc-admin-acb2] Resetting the PersistentConnection

2017-01-17T09:00:11.810+01:00 [02212 verbose 'RemoteVcServer' connID=vc-admin-acb2] Installing new stub adapter factory into the PersistentConnection.

2017-01-17T09:00:11.810+01:00 [02212 verbose 'RemoteVcServer' connID=vc-admin-acb2] Invoking StartInternalLocked

2017-01-17T09:00:11.810+01:00 [02212 verbose 'RemoteVcServer' connID=vc-admin-acb2] Starting connection monitoring.

2017-01-17T09:00:11.810+01:00 [02212 verbose 'RemoteVcServer' connID=vc-admin-acb2] Dr::PersistentConnection::ConnectLocked: Attempting to connect

2017-01-17T09:00:11.810+01:00 [02212 info 'RemoteAuthzServer' connID=authz-admin-43be] Resetting the PersistentConnection

2017-01-17T09:00:11.810+01:00 [02212 verbose 'RemoteAuthzServer' connID=authz-admin-43be] Installing new stub adapter factory into the PersistentConnection.

2017-01-17T09:00:11.810+01:00 [02212 verbose 'RemoteAuthzServer' connID=authz-admin-43be] Invoking StartInternalLocked

2017-01-17T09:00:11.810+01:00 [02212 verbose 'RemoteAuthzServer' connID=authz-admin-43be] Starting connection monitoring.

2017-01-17T09:00:11.810+01:00 [02212 verbose 'RemoteAuthzServer' connID=authz-admin-43be] Dr::PersistentConnection::ConnectLocked: Attempting to connect

2017-01-17T09:00:11.810+01:00 [02212 info 'RemoteDrServer' connID=dr-admin-9301] Resetting the PersistentConnection

2017-01-17T09:00:11.810+01:00 [02212 verbose 'RemoteDrServer' connID=dr-admin-9301] Installing new stub adapter factory into the PersistentConnection.

2017-01-17T09:00:11.810+01:00 [02212 verbose 'RemoteDrServer' connID=dr-admin-9301] Invoking StartInternalLocked

2017-01-17T09:00:11.810+01:00 [02212 verbose 'RemoteDrServer' connID=dr-admin-9301] Starting connection monitoring.

2017-01-17T09:00:11.810+01:00 [02212 verbose 'RemoteDrServer' connID=dr-admin-9301] Dr::PersistentConnection::ConnectLocked: Attempting to connect

2017-01-17T09:00:11.810+01:00 [02212 verbose 'PropertyProvider' connID=c037] RecordOp ASSIGN: info.progress, dr.site.SiteManager.pairSrm1. Applied change to temp map.

2017-01-17T09:00:11.810+01:00 [02884 verbose 'RemoteSsoServer.ConnHandler' connID=sso-admin-b822] Connecting to SSO server.

2017-01-17T09:00:11.810+01:00 [04748 warning 'RemoteVcServer' connID=vc-admin-acb2] Failed to connect: std::exception 'class Vmacore::NotInitializedException' "Not initialized: VcConnectionHandler SSO"

2017-01-17T09:00:11.810+01:00 [03088 warning 'RemoteDrServer' connID=dr-admin-9301] Failed to connect: std::exception 'class Vmacore::NotInitializedException' "Not initialized: DrConnectionHandler SSO"

2017-01-17T09:00:11.810+01:00 [04476 warning 'RemoteAuthzServer' connID=authz-admin-43be] Failed to connect: std::exception 'class Vmacore::NotInitializedException' "Not initialized: AuthzConnectionHandler SSO"

2017-01-17T09:00:11.904+01:00 [04508 verbose 'SamlTokenFactory' connID=sso-admin-b822 ctxID=37acec3e] Setting signing certificates and broadcasting.

2017-01-17T09:00:11.904+01:00 [03088 trivia 'SsoClient'] opId=52411ecc-97b0-960a-cee6-885b11fbfab3 START operation SecurityTokenServiceImpl::AcquireTokenByCertificate

2017-01-17T09:00:11.904+01:00 [03088 trivia 'SsoClient'] Requested token with spec `TokenSpec [tokenLifetimeSec=28800, delegable=false, delegateTo=<unset>, renewable=false, confirmationType=0,audienceRestriction=, advice=[]'

2017-01-17T09:00:11.982+01:00 [03088 trivia 'SsoClient'] opId=52411ecc-97b0-960a-cee6-885b11fbfab3 END operation SecurityTokenServiceImpl::AcquireTokenByCertificate

2017-01-17T09:00:11.982+01:00 [03088 error 'RemoteSsoServer.ConnHandler'] `anonymous-namespace'::ConnectHandler::GetTokenComplete: Unable to retrieve token from SSO; no token. Exception:

--> Authentication failed: Invalid credentials

...

2017-01-17T09:05:11.829+01:00 [02736 error 'DrSiteSiteManager' connID=c037] Connection check to service of type com.vmware.cis.cs.identity failed:

--> std::exception 'class SsoClient::InvalidCredentialsException' "Authentication failed: Invalid credentials"

2017-01-17T09:05:11.829+01:00 [02736 error 'DrSiteSiteManager' connID=c037] Connection check to service of type com.vmware.cis.vcenterserver failed:

--> std::exception 'class Dr::TimeoutException' "Operation timed out after '300.000000' seconds"

2017-01-17T09:05:11.829+01:00 [02736 error 'DrSiteSiteManager' connID=c037] Connection check to service of type com.vmware.cis.cs.authorization failed:

--> std::exception 'class Dr::TimeoutException' "Operation timed out after '300.000000' seconds"

2017-01-17T09:05:11.829+01:00 [02736 error 'DrSiteSiteManager' connID=c037] Connection check to service of type com.vmware.dr.vcDr failed:

--> std::exception 'class Dr::TimeoutException' "Operation timed out after '300.000000' seconds"

2017-01-17T09:05:11.829+01:00 [02736 info 'RemoteLkpServer' connID=c037] Shutting down connection

Problem sent to VMware support. VMware recommendation is to reinstall vCenter/PSC on both sites.

No DNS resolution, static entries in hosts files on vCenter, PSC and SRM.

User account used for site pairing is member of SSO Administrators group on recovery site.

Reply
0 Kudos
6 Replies
admin
Immortal
Immortal

It's worth checking the sso logs around the same timestamp - there may be more detailed information about the failure.

Reply
0 Kudos
mclusky
Contributor
Contributor

Thnx for your suggestion. I'll follow it.

Can you tell mi more which particular logs to check ad from which sites (pairing procedure is being initiated from protected site where VCSA with external PSC is used).

Reply
0 Kudos
admin
Immortal
Immortal

Check the sso log directory of the Platform Services Controller on the remote site - the site you are trying to pair with:

ProgramData\VMware\vCenterServer\logs\sso

vmware-sts-idmd.log may have useful information.

mclusky
Contributor
Contributor

Thnx. According do VMware Support recovery site was reinstalled.

After reinstallation I could pair sites.

Reply
0 Kudos
johnvmx
Contributor
Contributor

What exactly you reinstalled?

Reply
0 Kudos
edmoncu
Contributor
Contributor

Same question here

*update* :

  • reinstalled SRM on my master vcenter
  • restarted the pairing from the master SRM
  • SRM started at 55% already and resumed... now it's at 62% and counting

*update #2* :

  • pairing failed at 70% due to credentials error.
    • reinstalled vcenter and SRM on both sites and re-pair'ed. pairing apparently was supposed to be very quick
Reply
0 Kudos